The motto of the 1933 World’s Fair in Chicago triumphantly reads, “Science Finds, Industry Applies, Man Conforms.” Such a motto paints a deterministic picture of science as unreflectively accumulating bits of knowledge, independent of final ends and applications. Science is depicted as an autonomous, mechanistic process rather than a conscious, human activity. But if science finds, who decides where it looks? To what extent can we collectively direct its ends, and to what ends should the resulting technology be put?
Having studied, worked, and traveled around Asia and Europe, I have seen first-hand both the positive and negative social and educational implications of digital technology in the classroom and beyond. Others, too, have recognised this potential for both good and bad, as “data science for social good” initiatives are now emerging all across the world. But if data science for social good is to be done seriously, assumptions about human nature and what the “good life” might look like can’t be avoided. Yet prominent artificial intelligence researchers have publicly complained about having to consider the ethical and social impact of their work (https://www.geekwire.com/2020/retired-uw-computer-science-professor-embroiled-twitter-spat-ai-ethics-cancel-culture/). They see ethical questions as outside the scope of their scientific duties. The philosophical question is this:
In today’s hyper-connected, globalised world, can we ever reach rational agreement on good and bad uses of technology? I think we can.
Against the prevailing notion of “value-free science,” I’m interested in finding ways to make science and technology align with basic human values. Maybe I’m old fashioned, but I think the ever-widening division of intellectual labor reflected in academia harms our ability to think big, integrate wide areas of knowledge under a single vision, and invites us to outsource to “experts” certain skills and knowledge that might otherwise lead to us having happier, more fulfilling lives.
The COVID-19 pandemic has made clear the essential role that digital technology plays in our lives. Smartphones, tablets, and computers surround us, constantly recording, processing, and predicting our behaviour, often without our conscious awareness or consent. (The Oxford philosopher Luciano Floridi writes about our new “onlife” shared with digital, transhuman “inforgs.”) Our digital interactions, and the behavioural data left in their wake, are used to index, search, filter, and rank both the content and its delivery method using a suite of methods broadly known as personalisation.
Every day, new devices and techniques are being devised to collect more and better behavioural data, both in terms of greater volume and greater detail. Personalisation, as with any new technology, holds both promise and peril for persons and society. Personalisation could be used to detect phishing and fraud susceptibility, deliver personalised medicine and educational materials, or help diagnose abnormal neurological conditions, such as Parkinson’s disease, among many other potentially socially beneficial uses.
Yet these goals have not made up the bulk of research on the topic. Very few commercial applications of personalisation have aimed at facilitating human communication or self-understanding. Instead, profit-seeking interests have dominated the conversation. Prediction, manipulation, and control are the scientific interests that have found profitable business applications (Yeung, 2017; Zuboff, 2019), especially in the wake of COVID. Streaming service Netflix recently recorded the biggest period of growth in its history (www.forbes.com/sites/joewalsh/2020/10/20/netflix-subscriber-growth-slows-after-surging-during-pandemic/?sh=73a68355244e). The massive increase in digital data collection brings with it opportunities for improvement in services, especially in the realms of marketing and advertising, which constitute nearly all the revenues of major digital platforms like Google and Facebook.
Influential technologists have recently begun to criticise the way in which major digital platforms harvest user data and use algorithmic recommendations. Twitter CEO Jack Dorsey acknowledged in an interview that he failed to consider how the app’s design would affect “how people interrelate with one another, [and] how people converse with one another” (https://www.nytimes.com/2020/08/07/podcasts/the-daily/Jack-dorsey-twitter-trump.html). Similarly, Apple CEO Tim Cook is worried about how personalisation can be used to push dangerous conspiracy theories. In a recent conference presentation, Cook said:
“At a moment of rampant disinformation and conspiracy theories juiced by algorithms, we can no longer turn a blind eye to a theory of technology that says all engagement is good engagement, the longer the better, and all with the goal of collecting as much data as possible.” (www.reuters.com/article/us-apple-facebook-idUSKBN29X2NB)
A “Critical” Attitude Towards Personalisation
As the above quotes suggest, it’s time we re-evaluated the aims and goals of personalisation. This article takes a critical attitude (Fay, 1987, p. 203) towards personalisation. The critical attitude consists of three basic assumptions about the human condition: 1) humans are typically unfree and dominated by things they don’t understand or control; 2) human existence could be otherwise; and 3) the way to make it otherwise is to increase our understanding and knowledge of that which we cannot control or understand.
The critical attitude is inherently “emancipatory,” in the sense of German philosopher and social theorist Jürgen Habermas. Understanding the basic ideas and assumptions driving personalisation can motivate and empower one to exercise one’s current rights under the EU’s General Data Protection Regulation (GDPR). More generally, my hope is that this piece may encourage readers to actively participate in public sphere discourse about the ends and purposes of personalisation technology.
What are Personalised Predictions?
At their core, personalised predictions are the output of algorithms. Algorithms are essentially sequences of transformation rules that take input data and convert it into output data. They can range in complexity from the easy to understand (think long division) to the nearly unintelligible “black box” kind, most notably “deep” neural networks. The greater predictive accuracy of neural networks comes with a price, however: their complex inner workings are often not clear, even to their designers.
In the last decade, as both the amount of data and the availability of cheap processing power have increased, there’s been a general move towards using deep neural networks wherever possible. Black box algorithms are now the de facto standard for personalisation efforts by Google, Facebook, and other major digital platforms. Increasingly too, black box approaches are being used in areas of “high-stakes” decisions, such as parole hearings, bail, and the determination of safe levels of environmental pollution (Rudin, 2019).
The literature is not consistent about what separates personalised from non-personalised technology and methods. Basically, if a data collector or processor generates a prediction for an individual data subject, then for all intents and purposes it is a “personalised” prediction. For now, I’ll consider personalisation as a broad array of technical, prediction-based methods and approaches borrowed and extended from the fields of artificial intelligence (AI), machine learning (ML), and statistics.
We might, however, note one necessary condition of personalisation: the predictions are applied to digital representations of persons. Unsurprisingly, ideas, methods, and theories from economics and psychology are now being applied to personalisation, as it relies on human-generated behavioural big data—the micro-level data collected and stored on platforms and devices. When I say micro-level behaviours, I really mean micro:
Clicks, mouse movements, cookies, search terms, and other “trivial” digital behaviours make up the digital haystacks in which algorithms and statistical models attempt to find predictive needles.
Such micro-level human data are known as behavioural big data (Shmueli, 2017). These data may also derive from the behaviours of others connected or related to you on social networks.
After algorithms have ingested and aggregated the behavioural big data needed for predictions, these predictions are then joined back and applied to the individual users as needed.
After algorithms rank a piece of content for you (e.g., a video, post, or link), you’ll see this content placed before everything else the next time you log on. That’s personalisation in a nutshell.
Exploration or Exploitation?
As this new black box, behaviour-based paradigm of personalisation has taken over industry, new methods of personalisation have become popular. Many of these methods evolved from theoretical work done during Cold War operations research and are based on fundamental results in numerical optimisation, animal learning, behaviourist psychology, game theory, and the economics of information.
The multi-armed bandit approach, for example, is a generalisation of methods in the statistical design of experiments. The name “bandit” stems from old-time slot machines in casinos. Each slot might give out different payoffs, and the discerning player’s task is to try to figure out, on the basis of limited time and gambling money, which “bandit” pays out the most. Once known, the gambler would stop playing all the others and concentrate on the most lucrative bandit (conforming to the axioms of rational choice). This process is referred to as finding an optimal balance between “explore” and “exploit” strategies. A pure exploration strategy tests or “explores” all possible slot machines; a pure exploit strategy would only play the bandit with highest known payoff, potentially missing out on other, higher paying bandits in the process.
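To make the explore/exploit trade-off concrete, here is a minimal sketch of one common bandit strategy, usually called epsilon-greedy. It is an illustration only, not the method any particular platform uses. The function name `epsilon_greedy`, the payoff probabilities, and the parameter `epsilon` (the share of plays spent exploring) are all assumptions made for the example.

```python
import random

def epsilon_greedy(true_payoffs, rounds=5000, epsilon=0.1, seed=0):
    """Play slot machines that pay 1 with a hidden probability, else 0.

    true_payoffs is hidden from the player; it is only used to simulate
    each machine. All values here are hypothetical.
    """
    rng = random.Random(seed)
    n_arms = len(true_payoffs)
    pulls = [0] * n_arms        # how often each machine was played
    estimates = [0.0] * n_arms  # running average payoff per machine
    total_reward = 0
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random machine
        else:
            # exploit: play the machine with the best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1 if rng.random() < true_payoffs[arm] else 0
        pulls[arm] += 1
        # incremental update of the running average for this machine
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward
    return pulls, estimates, total_reward

pulls, estimates, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
print(pulls, [round(e, 2) for e in estimates], total_reward)
```

Setting `epsilon=1.0` gives the pure exploration strategy described above, and `epsilon=0.0` gives pure exploitation. In this sketch, a pure exploiter facing payoffs `[0.0, 0.9]` never leaves the first machine, because it never gathers the evidence that the second one pays better.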
The goal of such algorithms is to essentially treat you and your preferences as a mathematical object which exists but is unknown. Personalisation researchers often speak about behavioural data through the theoretical lens of “revealed preferences.” They assume persons make decisions consistent with an unobservable objective function which they refer to as “utility” (Chambers & Echenique, 2017, p. xii). Only by giving you many options and observing how you interact with them can they confirm their “theory” about what the mathematical representation of your preferences might look like.
This description is not 100% accurate, though. Personalised systems do not create a specific representation of your specific utility function, but instead aggregate behavioural data among many other users to create generic groups or profiles and then, on the basis of your similarity to the group, apply the profile to you.
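The group-based approach just described can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: the click counts, the group names, and the use of an averaged vector with cosine similarity are all hypothetical choices, and real systems use far richer data and models.

```python
from math import sqrt

def mean_vector(vectors):
    """Average several users' behaviour vectors into one group profile."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity between two behaviour vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_profiles(groups):
    """Aggregate each group's members into a single generic profile."""
    return {name: mean_vector(members) for name, members in groups.items()}

def assign_profile(user_vector, profiles):
    """Pick the profile most similar to this user's recorded behaviour."""
    return max(profiles, key=lambda name: cosine(user_vector, profiles[name]))

# Hypothetical click counts per user over three categories: [news, sport, music]
groups = {
    "news_readers": [[9, 1, 0], [7, 2, 1]],
    "music_fans": [[0, 1, 8], [1, 0, 9]],
}
profiles = build_profiles(groups)

new_user = [1, 0, 3]  # only a handful of clicks recorded so far
print(assign_profile(new_user, profiles))  # prints "music_fans"
```

The point of the sketch is that the new user is never modelled individually: four clicks are enough to be treated like everyone else in the nearest group.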
Assuming that recorded behaviours “reveal” preferences is flawed for several other reasons. First, digital platforms can and do modify digital environments in real time, so it’s not clear how users are actually interpreting digital environments. Second, users do not actually compare all possible pairs of actions or items as rational choice axioms assume. Users may not even perceive that some items are offered or other actions are open to them.
Nevertheless, personalised systems use the concepts of utility and preferences in a formalisation called reinforcement learning. The science behind reinforcement learning derives from work in cybernetic control theory, animal learning, and neuroscience. Its philosophical roots trace back to the British empiricists of the 18th century, whose ideas were picked up by neo-behaviourist psychologists exploring how stimuli become associated with responses via reinforcement processes (Dickinson, 2012). According to the subfield’s founders, there is “mounting evidence from neuroscience that the nervous systems of humans and many other animals implement algorithms that correspond in striking ways to reinforcement learning algorithms” (Sutton & Barto, 2018, p. 377). In particular, dopamine-releasing neurons are hypothesised to “broadcast a reinforcement signal” to areas of the brain used in both classical and instrumental conditioning experiments (ibid., p. 385). I believe this is what former Google “Design Ethicist” Tristan Harris means when he speaks of the “race to the bottom of your brainstem” (https://www.commerce.senate.gov/services/files/96E3A739-DC8D-45F1-87D7-EC70A368371D).
In reinforcement learning, an “agent” automatically learns from environmental feedback (that is, your behavioural responses to its actions) in real time, without human intervention or explicit goal setting about which specific actions to take. Metaphorically speaking, you are an alien environment and the algorithm is trying to navigate you, without any explicit help from anyone.
Unlike the basic ML paradigms of supervised and unsupervised learning, reinforcement learning (RL) is considered a third paradigm because it neither requires an external teacher to provide labeled data nor aims to discover underlying structure in data (Sutton & Barto, 2018, p. 2). By repeatedly interacting with its environment (you), the agent “learns” the optimal way to act so as to maximise long-term rewards, which in personalisation are often “dwell time”, “engagement”, or “click through” (Dragone et al., 2019).
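The loop of acting, observing a response, and updating can be sketched with tabular Q-learning, one textbook reinforcement learning algorithm. It stands in here purely as an illustration and is not a claim about what any platform actually runs. The two user states, the two content types, and the reward numbers (a stand-in for dwell time) in `simulated_user` are invented for the example.

```python
import random

STATES = ["bored", "engaged"]
ACTIONS = ["familiar", "novel"]

def simulated_user(state, action):
    """Hypothetical user model: returns (next_state, reward).

    The reward is a made-up proxy for dwell time.
    """
    if state == "bored":
        return ("engaged", 1.0) if action == "novel" else ("bored", 0.0)
    return ("engaged", 2.0) if action == "familiar" else ("bored", 0.0)

def q_learning(steps=10000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values from interaction alone, with no labeled examples."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    state = "bored"
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # occasionally try something else
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])  # best known action
        next_state, reward = simulated_user(state, action)
        # Q-learning update: move the estimate towards the observed reward
        # plus the discounted value of the best action in the next state.
        target = reward + gamma * max(q[next_state].values())
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q

def greedy_policy(q):
    """For each state, the action with the highest learned value."""
    return {s: max(q[s], key=q[s].get) for s in q}

print(greedy_policy(q_learning()))
```

Nobody tells the agent which content to show. It settles on showing novel content to a bored user and familiar content to an engaged one only because, in this toy environment, that pattern maximises the long-run reward signal.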
Put simply, both multi-armed bandit and reinforcement learning approaches to personalisation seek to answer the question: is it more profitable to explore your preferences or to exploit them?
The GDPR: Key Concepts
The architects of the 2018 GDPR intended to change the way companies around the globe (“data collectors” and “data processors”) collect, store, and process the personal data of “data subjects.”
Personal data refers to “any information relating to an identifiable natural person” and data subjects are “natural living persons” like you or me. The GDPR further regulates the automated processing and algorithmic profiling of personal data. Algorithmic profiling is any kind of “automated processing of personal data used to predict a natural person’s interests or preferences” with “significant” (legal or otherwise) effects on the natural living person. Personalisation falls under algorithmic profiling.
The data subject rights listed in the GDPR were meant to harmonise with broader European ideas about human rights. The European Charter of Fundamental Rights (ECFR), proclaimed in the year 2000, recognises fundamental rights to privacy and protection of personal data for all persons in the “human community.” From the European perspective, data protection and privacy are tools aimed at preserving human dignity (Lynskey, 2015). Dignity means the worth or fittingness of the human person and is crucial to the GDPR (Floridi, 2016). Philosophically, dignity stems from the exercise of the human capacity for conscious reasoning and reflection.
Generally speaking, the rights of data subjects under the GDPR can be categorised as related to transparency (i.e., clear and unambiguous consent, clear communication with data subjects), information and access (i.e., who collected the data and for what purpose), rectification and erasure (i.e., allowing data subjects to correct false information and delete old information), and objection to (automated) processing (i.e., removing consent to processing of any personal data including algorithmic decision-making).
Two major legal notions underlie the GDPR’s attempt to preserve human dignity in the digital age: informational self-determination and its predecessor, the right to the free development of one’s personality. Informational self-determination is “an individual’s control over the data and information produced about him,” and is a necessary precondition for any kind of human self-determination (Rouvroy & Poullet, 2009). In turn, self-determination is a precondition for “a free democratic society based on its citizens’ capacity to act and to cooperate” (ibid.). Similarly, the right to personality permits persons to freely develop their personalities so long as they do not violate the rights of others, the constitution, or the moral code (Coors, 2010). It also encompasses various privacy rights stemming from the German Constitution (Articles 1 and 2), including the “right to one’s image, the right to one’s name, and the right to oppose publication of private facts” (ibid.).
In brief, human dignity is assumed to derive from the expression of these capacities to determine one’s own identity and to develop a unique personality. As modern life is increasingly mediated by digital technology, these rights must be extended to the digital realm. The GDPR does just that.
Habermas 101 – Introducing Critical Info Systems
I’m not the first to note how Habermas’ ideas might be applied to critique the use and applications of digital technology. Experts in the field of Critical Information Systems (IS) have viewed information systems as fundamentally social communication systems (Myers, 1994). Similarly, “data” used by these systems are the result of social and technical processes that evolve over time. The influential IS scholar Kalle Lyytinen considers data as anything that “can be brought to bear in support (as evidence) while making knowledge claims” (Lyytinen, 2009).
The relevant question for personalisation is, who or what decides what “personal data” means? Against the traditional empiricism assumed in personalisation, data are not simply “given,” but can be further analysed and interpreted in light of their economic and technical genesis (Habermas, 1971). Any meaningful artifact can be viewed from two perspectives: “as an observable event and as an understandable objectification of meaning,” but only the “participant” in the shared speech community has access to both (Habermas, 1992, p. 23). Outside observers, the impartial (data) “scientists,” are not participants in this community and thus, ironically, can only claim to partially “understand” the data.
In his 1971 book Knowledge and Human Interests, Habermas distinguishes between three different spheres of scientific interests, each associated with its own logic and methods: technical or instrumental interests (the domain of empirical-analytic sciences), practical/communicative interests (the domain of historical-hermeneutic sciences), and critical/emancipatory interests (the domain of critical social sciences). The empirical-analytical sciences are concerned with technical control and prediction (this would include the fields of machine learning, data science, and engineering), while the historical-hermeneutical sciences aim at understanding through interpretation. Finally, the critical social sciences aim to “dissolve” the effects of “ideology” by self-reflection, much as a psychoanalyst helps a patient to reflect on the underlying causes of his neurotic symptoms. Freud’s concepts of the unconscious and repression play an important role in Habermas’ notion of ideology.
Habermas’ main aim is to explain how empirical-analytical sciences became synonymous with the concept of science itself by artificially separating subject (knower) from object (knowledge), and thus declaring vast areas of the “human sciences” unimportant and meaningless. Habermas argues that only through critical reflection might we realise how the empirical-analytical sciences represent just one of several possible equally legitimate knowledge interests we might pursue.
Habermas expanded these ideas in a later two-volume work The Theory of Communicative Action (published in English in 1984 and 1987 respectively) and in his 1992 Moral Consciousness and Communicative Action, which greatly impacted the field of Critical IS. I’ll briefly mention some key ideas that have taken hold in the Critical IS literature.
First, Habermas’ theory assumes that what makes humans unique is their ability to coordinate their activity through language; this is what he calls communicative action. Its purpose is to achieve and maintain shared understanding against the background of a shared “lifeworld” of history, cultural traditions and rituals. Other forms of action he describes are instrumental action aimed at achieving goals in a non-social way, strategic action aimed at influencing others to achieve one’s goals, and finally discursive action aimed at reestablishing agreement after a communicative breakdown (Mingers, 2001).
Following Karl Popper, Habermas imagines persons simultaneously inhabiting three worlds: an objective world of actual and possible states of affairs, a social world of normatively regulated social relations, and a subjective world of personal experience and beliefs (Mingers, 2001).
Habermas contends that every social utterance makes implicit validity claims relating to its comprehensibility, truth, rightness, and sincerity with respect to each of the three worlds (Mingers & Walsham, 2010). Comprehensibility is seen as a precondition for shared communication, while “truth claims” are made with respect to the objective world, “rightness claims” with respect to the social world of behavioural norms, and “sincerity claims” with respect to the subjective world of one’s beliefs and intentions.
Participants in a given discourse can dispute and defend the validity of an utterance on the four grounds mentioned above. Basic conditions of equal participation and the ability to make assertions, express wishes and be accountable for one’s conduct make this situation “ideal” (Benhabib, 1985). Insofar as discourses approximate the “ideal speech situation” the outcomes are considered valid.
The notion of the “ideal speech situation” plays an important role in Critical IS and is used to analyse how current information systems might be reconceived to avoid systematically “distorted communication” among users and designers (Hirschheim & Klein, 1989; Wilson, 1997). The “ideal” aspect of the “ideal speech situation” derives from the fact that it assumes no intention on the part of participants to deceive or manipulate others. In this ideal situation, democratic consensus about system design emerges through strength of argument, not power or deception. The contribution of Habermas has been to envision how consensus might be possible on system objectives, designs, and implementation considerations in a way that serves the common human interests of all participants.
A Habermasian approach to personalisation thus requires active participation from all stakeholders and a willingness by informed citizens to engage in “public sphere” debates about the aims and purposes of such technology.
We’re Gonna Fight for our Right to… Privacy: Habermas & The GDPR
How would the GDPR play a role in Habermas’ idealised communicative process? The short answer is that the GDPR’s data subject rights can be viewed as protection against the “colonisation” of our “lifeworlds” by commercial interests and manipulative strategic action. Further, the rights to modify and dispute the veracity of one’s personal data can be viewed as a form of digital “discourse” by which participants (data subjects and data collectors/processors) come to mutual agreement about the meaning and interpretation of behavioural data.
The GDPR’s focus on transparency, notification, and communication serves to reduce power and information asymmetries between data subjects and data collectors that would otherwise distort communication between them and prevent joint coordination of decision-making. Digital platforms would thus lose their monopoly on interpretation.
1. Explicit Consent (Article 7)
The GDPR’s focus on explicit consent for data collection and processing demonstrates the problems associated with interpreting the meaning of behavioural data. The European Data Protection Board (EDPB) recently clarified its definition of clear and unambiguous consent to data processing so that behaviours such as swiping or scrolling do not constitute clear, explicit consent to processing (https://edpb.europa.eu/sites/edpb/files/files/file1/edpb_guidelines_202005_consent_en.pdf). Data collectors cannot simply speak for data subjects as if they were impartial observers.
2. Right to Human Intervention (“Human in the Loop”) (Article 22)
Besides allowing data subjects to opt out of automated processing, Article 22 has two key provisions: 1) data subjects have a right to obtain human intervention; 2) data subjects can contest the automated decision (and can also access the personal data used to make the decision). Additionally, data subjects must be “informed” about any automated processes and provided with “meaningful information” about the logic of the decisions and the possible consequences of such automated profiling. Such “meaningful information” includes the ability to “obtain an explanation of the decision reached” (Recital 71). Personalised recommendations, particularly in morally salient contexts such as dating or job hunting, could fall under this provision. If data subjects do not know that such profiling is occurring (even with a “human in the loop”), or do not understand how it was done, their rights to due process may be undermined. Further, the increasing use of “black box” personalisation algorithms makes it difficult to explain how decisions are reached.
3. Right to Be Forgotten (Article 17)
The genealogy of such a law traces back to the French le droit à l’oubli (the “right of oblivion”) which allowed convicted criminals who had served their time and had been “rehabilitated” to object to the publication of certain facts about their imprisonment (Rosen, 2011). The right to be forgotten mirrors the natural workings of autobiographical memory, in which the act of forgetting is essential in constructing a self-narrative over time (Conway & Pleydell-Pearce, 2002). Persons actively take part in constructing self-narratives to understand themselves, their behaviour, and their roles in society (McAdams, 1996).
4. Rights to Access and Modify Personal Data (Article 12)
The GDPR upholds individual subjectivity and expression in creating one’s digital representation (Rouvroy & Poullet, 2009). Data subjects’ rights to access, delete, modify, and get a portable copy of their personal data potentially grant them the ability to both more deeply understand themselves and subjectively narrate their personal identities over time. Though the technological means for doing so are currently limited, data subjects can already modify their names, their genders, and drop nationalities (Andrade, 2012). There are limits to these rights, however, and Recital 73 states that EU member states can make exceptions in cases of historical, statistical, and public interest research.
Communication or Control? The Future of Personalisation
In a world devoid of human values, the motto “Science Finds, Industry Applies, Man Conforms” may indeed be true. However, as I have tried to show, the GDPR gives data subjects foundational legal tools to shield against the scientific and technological determinism endorsed by major digital platforms.
The communicative and interpretive nature of persons must be safeguarded in the digital age. To this end, the GDPR provides a basic suite of rights that give data subjects interpretive resources, drawn from their respective lifeworlds, to make sense of their digital identities and behaviours. The hope is that our digital identities may arise more from communication and negotiation than from one-time platform stipulation. Once aware of this possibility, our digital conformity becomes a choice, not a destiny.
Much remains to be done, though. First, to exercise these rights, data subjects must know about them.
More needs to be done to educate ordinary people about GDPR rights and how they relate to data collection and personalisation.
Second, new technical solutions will need to be created to enable the exercise of these rights. For example, when getting a copy of one’s data, what format(s) should it be in? And if exercising one’s right to be forgotten, what happens to the models trained using your data? In an important recent development, the USA’s Federal Trade Commission (FTC) ruled that the models built from users’ personal data also had to be deleted if the data were collected deceptively (https://www.ftc.gov/news-events/press-releases/2021/01/california-company-settles-ftc-allegations-it-deceived-consumers). It’s not clear what the GDPR says on this issue.
We cannot “solve” the problem of ethical AI as we can a mathematical problem. This is because we cannot agree on an appropriate representation of “fairness” or “justice” or any concept with an inherently moral dimension. Yet, as Habermas argues, the moral and social cannot be separated, and to the extent we can approximate the ideal speech situation, the outcomes of communal decision-making can achieve moral legitimacy. That might be the best we can do in today’s globalised, multicultural world. Right now the discussion is certainly not ideal as the strategic and manipulative uses of machine learning, driven by advertising profits, prevent equal participation in digital self-determination.
In closing, I’ll mention a few potential rays of hope in the area of AI ethics. Human Centered AI (Shneiderman, 2020; a similar proposal has been made by Stahl et al., 2020) seeks to deploy AI only when doing so uniquely contributes to human flourishing and well-being. Some repetitive tasks might best be automated by AI, while those deemed important for human functioning are left alone. Other researchers have developed value sensitive design (Friedman, 1996) and designing for human rights (Aizenberg & van den Hoven, 2020) approaches which aim to balance consideration of moral values with technical specifications in a practical way.
If we follow Habermas in the belief that human nature is inherently social, interpretive, and communicative, then making and sustaining a coherent digital self-narrative is a uniquely human capacity and essential to our human dignity. We cannot blindly automate or outsource this capacity to others. The GDPR therefore represents the first step of a long journey towards human self-understanding in the digital age. As further EU legislation, such as the Digital Services Act (https://ec.europa.eu/digital-single-market/en/digital-services-act-package; see also the European Commission’s Democracy action plan and worries about algorithmic manipulation: https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52020DC0790&from=EN), looms on the horizon, time will tell whether such steps are big enough.