Ethics at the Core of a Data-Driven World

There is a famous Dilbert cartoon from a few years ago in which Dogbert advises Dilbert’s company that they can generate revenue from the information they hold about their customers, but first they have to dehumanise the enemy by calling them data. This pretty accurately summarises one approach to data management that has pervaded the early years of this Information Revolution. However, as the tools and technologies for gathering, analysing, and acting on information become increasingly powerful, we find ourselves facing a tipping point in our love affair with these technologies. This tipping point is all the more pronounced as we consider the impact of data-driven processes on democratic processes and human rights around the world.

The question of ethics in information management is often conflated with the challenges of managing data privacy, particularly in an increasingly interconnected information landscape. However, privacy is only the entry point for any meaningful discussion of ethical issues in information management. When we begin to look at the various ethical issues that arise in the implementation of ‘big data’, we see that the real privacy issue is not simply the potential loss of privacy and individual agency in an age when we are transparent to the algorithms, but rather the issues that arise when we must trade off privacy against other interests or benefits. If information should be processed to serve mankind (as Recital 4 of the General Data Protection Regulation tells us), then as we dig deeper we find further questions of ethics and ethical conduct that bear on that fundamental principle of ethical information management.

Big data raises ethical questions

In Chapter 3 of our book, Ethical Data and Information Management, we look at examples such as the ethical questions raised when the tools for big data analytics can only run on technology that is affordable in the First World, a problem which has led one data scientist and blogger to explore the potential for what he calls “Cheap Data Science”. The ethical question here is simple: is it fair that the future, to paraphrase the science fiction author William Gibson, “is here, but not yet evenly distributed”? The very people who might benefit most from improved data analytics on issues such as soil erosion or the spread of disease cannot do so because of a barrier to entry: developers living in affluent Western economies have baked their assumptions about system performance and network capabilities into the design of the very technologies that data analysts in developing countries need. Is it ethically responsible or sustainable to design software and tools that only work reliably in wealthier developed nations?

We also look at the potential benefits and harms of granular tracking and microtargeting of students at university level, in which the prevailing mindset of ‘more data is better’ has led to the development of technologies that analyse and predict student behaviour, performance, and potential to drop out. However, there is every reason to believe that the headline success stories are simply describing correlation rather than causation. This raises additional ethical issues in a data-driven world where success stories are often not subjected to the rigorous scrutiny they deserve. In the case of the burgeoning EdTech sector, the unanswered question is whether the investment in technologies to track students, their performance, and their interactions with course work is the cause of higher grades and better performance as claimed, or whether students who would perform better and get higher grades anyway are attracted to courses that have these cutting-edge facilities available. Is the relationship described causation or correlation?
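The distinction between correlation and causation can be made concrete with a small sketch. The Python below uses entirely hypothetical toy data, invented for illustration and not drawn from any real EdTech study. It assumes that a student's grade depends only on prior attainment, and that stronger students are more likely to choose courses with tracking. The function names (make_students, naive_gap, stratified_gaps) and all figures are assumptions made for this example.

```python
from statistics import mean


def make_students():
    """Build hypothetical toy data: grade depends only on prior attainment.

    Assumption for illustration: stronger students self-select into
    tracked courses (8 of 10 there are 'high' prior attainment), while
    tracking itself has no effect on grades at all.
    """
    students = []
    for tracked, n_high, n_low in [(True, 8, 2), (False, 2, 8)]:
        students += [{"tracked": tracked, "prior": "high", "grade": 70}] * n_high
        students += [{"tracked": tracked, "prior": "low", "grade": 55}] * n_low
    return students


def mean_grade(students, tracked, prior=None):
    """Mean grade for tracked or untracked students, optionally within one prior group."""
    return mean(
        s["grade"]
        for s in students
        if s["tracked"] == tracked and (prior is None or s["prior"] == prior)
    )


def naive_gap(students):
    """Headline comparison: tracked minus untracked, ignoring prior attainment."""
    return mean_grade(students, True) - mean_grade(students, False)


def stratified_gaps(students):
    """Same comparison, but only between students with the same prior attainment."""
    return {
        prior: mean_grade(students, True, prior) - mean_grade(students, False, prior)
        for prior in ("high", "low")
    }


students = make_students()
print("Naive gap:", naive_gap(students))
print("Gap within each prior-attainment group:", stratified_gaps(students))
```

In this toy data the headline comparison shows tracked students scoring 9 points higher on average, yet the gap disappears once like is compared with like. That does not show that real tracking systems have no effect. It shows only that a headline gap of this kind cannot, on its own, tell us which explanation is correct.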

Furthermore, even if there is a causal relationship, there has been limited research on the potential downsides of this type of invasive student tracking. The research that has been done raises concerns about the impact on pedagogic methods in universities, about student privacy and chilling effects on independence of thinking and expression among students, and about the choices that students or parents might make about course selection or their academic performance.

Ethical concerns of algorithmic bias

The issue of algorithmic bias in artificial intelligence (AI) also gives rise to ethical concerns, particularly when the inherent bias in training data is taken into account. While these algorithmic processes are often hailed as beneficial to society through time and cost savings, they frequently come with a hidden cost. For example, in Chapter 4 of our book we look at the problems with systems like COMPAS, a sentencing support system used in the U.S. courts system, which journalists at ProPublica found to be ‘remarkably unreliable’ in its predictions. White defendants were nearly half as likely as African-American defendants to be flagged for potential risk of reoffending, and the sentences recommended tended to be longer.

The question of how we train AI systems is, in and of itself, an ethical choice. In many respects when we are developing AI systems we are acting as parents, imparting values and supporting the development of ways of thinking about issues and inferring facts from the available data. The quality of the models we develop is directly influenced by the quality of the models we are developing from. In the example of COMPAS, a likely root cause of the inherent bias in the system is, for want of a better expression, the inherent bias in the system. Historical court rulings and case studies were used to train the AI. Historically, certain ethnic groups have fared better or worse in the U.S. criminal justice system. Similarly, facial recognition machine learning inherits biases from the images used to train it.

Other aspects of algorithmic bias are more subtle in their societal impact. When women tend to be shown lower-paying job adverts and hiring algorithms replicate similar results, this is an undesirable social and societal effect that raises ethical questions of fairness in the workplace. Less subtly, the emergence of Lethal Autonomous Weapons Systems raises the potential for armed conflicts to become more commonplace, as the risk to human life (on the combatant side) will be reduced through the deployment of autonomous weapons platforms.

Of course, it is not all doom and gloom. There are also many examples of the use of machine learning, AI, and data analytics to support and enhance the human condition. Companies such as Microsoft and Accenture have demonstrated smartphone-based assistive technologies for the visually impaired which can assist with a range of tasks and narrate information to the user. Developments such as these have the potential to significantly benefit the lives of countless people and would be impossible without the analytics, machine learning, and AI technologies that are at the cutting edge of our data-driven world. However, there are open ethical questions to be resolved. For example, implementing facial recognition in these technologies might mimic the human eye and brain identifying a familiar face in a crowd. But where that processing and matching takes place, and who else has access to the biometric data of the people that you know and want to recognise, becomes a balancing act. After all, with an assistive technology the matching will not happen inside your skull but in a web-based service hosted either on a device or in a cloud environment.

Integration of ethics into information management

In the Information Revolution we are generating, capturing, and processing an increasingly wide array of data about people, products, locations, events, and the relationships between them. As the potential to impact the lives of people in increasingly subtle but significant ways continues to grow, it is essential that we put an ethical core at the heart of the data-driven world. This does not, however, mean we need to invent a new discipline or develop a ‘new ethics’. Many of the questions we struggle with today have been discussed for many thousands of years. The philosopher Martin Heidegger famously stressed the need to recognise technology as a means to an end, not an end in and of itself. Immanuel Kant equally famously exhorted us to treat people as ends in and of themselves, not simply as a means to an end. Plato decried the impact of new technologies on the way knowledge is codified and imparted (he was talking about the development of writing, but that was the ‘big data’ of its day).

What we need to do is to incorporate fundamental ethical concepts and principles into the defined data management disciplines we already have. Anything else would simply be reinventing the wheel. The integration of ethics into Data Governance, its influence on Data Quality management, and the need to explicitly recognise the motivation for data processing activities are all key issues that need to be addressed. Appropriate mechanisms need to be introduced into organisations to ensure an effective alignment between the Ethic of the Individual, the Ethic of the Organisation, and the Ethic of Society. After all, if the Ethic of the Organisation is to consider all customers nothing more than “data”, then it will be difficult for any individual to act outside that ethical frame, even when the organisation is at odds with the Ethic of Society and is being lambasted for failing to control fake news and other issues.

Data ethics must be core

Organisations that succeed in addressing this challenge will develop a strong, sustainable competitive advantage as they will find it easier to attract and retain both staff and customers. Organisations that don’t will find themselves increasingly at odds with the Ethic of Society as they find they can only attract and retain staff on the more extreme negative end of the Ethic of the Individual. With higher staff turnover and a higher chance of being blind to an ethical crisis, such organisations will ultimately fail.

Another consideration is the extent to which legislation lags behind technological innovation. Legislation and regulation are usually only considered when the Ethic of Society has been so outraged by the actions of an organisation that legal sanctions are imposed or legislation for such sanctions is introduced. By adopting and adapting to an ethical information management culture, organisations will stay ahead of the requirement for regulation by simply striving to do the right thing in the context of Society, not just the Organisation.

The tipping point is now. Those organisations that put Ethics at the core of their data-driven world will reap long-term benefits.