Big Data in Public Administration: Rewards, Risk and Responses
By Paul Daly (University of Cambridge)
In April 2019, I was at the Socio-Legal Studies Association Conference at the University of Leeds, presenting a work in progress, “Artificial Administration: Administrative Justice in the Age of Machines”. In this post, I explain my interest in this important issue. A version of this piece was initially posted at the Administrative Law Matters blog. Readers interested in the human rights implications of big data are encouraged to review the University of Essex Big Data and Human Rights Project (‘HRBDT’).
We live in the era of Big Data, a term that “triggers both utopian and dystopian rhetoric”,[i] as it carries a “kind of novelty that is productive and empowering yet constraining and overbearing”.[ii] And we are now entering the world of artificial administration, where governmental bodies will replace or displace human decision-makers with information technology. A clash of value systems is or will soon be upon us, between technologists who insist on the glories of correlation and lawyers who refuse to yield on the time-honoured fundamentality of causation.
Artificial administration is the “sociotechnical ensemble” that combines technology and process “to mine a large volume of digital data to find patterns and correlations within that data, distilling the data into predictive analytics, and applying the analytics to new data”.[iii] My term evokes the replacement or displacement of human decision-makers by automated procedures. Techniques such as algorithms, neural nets and predictive analytics certainly have some place in the machinery of government, but clarity is needed about the settings in which they can properly be used.
Under the general rubric of “machine learning”, sophisticated algorithms, neural nets and other technologies carry great potential to improve governmental decision-making, by using automated systems to reduce cost,[iv] and, further, by reducing or removing the influence of fallible human beings, beset by various cognitive biases, on decisional processes.[v] The proposition that technology can improve the lot of the human race is not fanciful either, as technology makes possible new modes of deliberation and engagement, which can be assisted by government decision-makers committed to human flourishing.[vi] Artificial administration seems destined to play a key role in day-to-day public administration in the twenty-first century.
Indeed, use of information technology has long been a feature of governmental decision-making. Section 2 of the Social Security Act 1998 already made express provision for decision-making by computers (albeit those for which an official was responsible).[vii] Automation of systems is also familiar.[viii] Algorithms, too, are nothing especially mysterious – they are, after all, no more than recipes, specified steps to be taken to arrive at a desired conclusion. Primary school places in Cambridge are allocated by algorithm, uncontroversially and straightforwardly.[ix] Leveraging new technologies to further improve automated and algorithmic governmental decision-making, decreasing cost and increasing efficiency is an attractive proposition.
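The “recipe” point can be made concrete. A minimal sketch in Python — a hypothetical illustration loosely modelled on the admission criteria quoted in note [ix], not the council’s actual implementation — might rank applicants by priority category and break ties by straight-line distance to the school:

```python
# Hypothetical sketch of a priority-based school-allocation algorithm.
# The categories loosely mirror the admission criteria quoted in note [ix]
# (1 = looked-after child, 3 = in catchment, 6 = outside catchment, etc.);
# the real allocation process is more involved.

def allocate_places(applicants, capacity):
    """Rank applicants by priority category, breaking ties by
    straight-line distance to the school, and admit up to capacity."""
    ranked = sorted(applicants, key=lambda a: (a["category"], a["distance_km"]))
    return [a["name"] for a in ranked[:capacity]]

applicants = [
    {"name": "A", "category": 3, "distance_km": 0.8},  # in catchment
    {"name": "B", "category": 1, "distance_km": 2.1},  # looked-after child
    {"name": "C", "category": 3, "distance_km": 0.4},  # in catchment, closer
    {"name": "D", "category": 6, "distance_km": 1.0},  # outside catchment
]
print(allocate_places(applicants, capacity=3))  # → ['B', 'C', 'A']
```

The point of the sketch is that every step is inspectable: each criterion, and the order in which criteria apply, is visible on the face of the “recipe”. It is precisely this inspectability that machine-learning systems lack.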
As is now notorious, however, machine learning is deliberately opaque. Paul Dourish describes three layers of opacity: first, trade-secret protection, which ensures that private parties’ algorithms are shielded from public view; second, that “the ability to read or understand algorithms is a highly specialized skill, available only to a limited professional class”; third, that with machine learning, no one can “reveal what the algorithm knows, because the algorithm knows only about inexpressible commonalities in millions of pieces of training data”.[x] Indeed, “as these systems become more sophisticated (and as they begin to learn, iterate, and improve upon themselves in unpredictable or otherwise unintelligible ways), their logic often becomes less intuitive to human onlookers”.[xi]
When such techniques are used in a coercive fashion,[xii] to interfere with individuals’ rights, interests and privileges, the stakes are self-evidently high: “It is the capacity of these systems to predict future action or behavior” based on machine learning techniques which is “widely regarded as the ‘Holy Grail’ of Big Data”,[xiii] but the use of such “anticipatory governance” means that “a person’s data shadow does more than follow them; it precedes them, seeking to police behaviours that may never occur”.[xiv]
Even when these techniques are used non-coercively, to gather information for the purposes of informing governmental decision-making, the implications may be troubling.[xv] First, in general, information technologies “carry the biases and errors of the people who wrote them”.[xvi] As such, if the data inputted is flawed, the outcomes will also be flawed.[xvii] Second, those who are experts in machine-learning techniques will have much greater knowledge than elected representatives and members of the general public about how artificial administration operates. By contrast, while the internal workings of, say, the legislative process are extremely complex, there is no reason that an outsider armed with a pot of strong coffee and a copy of Erskine May cannot master the minutiae. The ranks of machine-learning experts will not necessarily include some (and will certainly not include all) of those tasked with operating systems of artificial administration; the experts could well form a separate and superior caste.[xviii] Third, the phenomenon of automation bias is now well recognised: individuals taking decisions on the basis of recommendations produced by a computer are more likely to follow the recommendations than to exercise independent judgment.[xix] Even if humans are kept ‘in the loop’ in theory, they may nonetheless become reliant in practice on technologies which, of course, they may not even understand. Fourth, the privacy implications of acquiring the information to feed into systems of artificial administration are obvious[xx] and, on some views, deeply troubling.[xxi]
Weighing up risks and rewards requires governmental decision-makers to engage in a difficult balancing act: to profit from the efficiency and accuracy gains of artificial administration whilst avoiding or minimising the potential costs of sophisticated information technology systems. Humans are hardly powerless in this balancing act. At both the collection and analysis stages, choices have to be made by humans about the framework for the use of Big Data and artificial administration.[xxii] There are “design choices” and they are “reflective of normative values”,[xxiii] as data is “never raw but always cooked to some recipe by chefs embedded within institutions that have certain aspirations and goals and operate within wider frameworks”.[xxiv]
Artificial administration does not stand alone: these technologies fall to be woven into existing norms of administrative law and administrative justice; they may, too, disrupt these norms. The norms of administrative law and administrative justice are not carved on tablets of stone. Just as they have evolved over the centuries they may well evolve again. Indeed, they may evolve in response to and in light of new norms, introduced by artificial administration. Even within governmental decision-making structures, the effects of artificial administration are likely to be uncertain. Managers can introduce technological modes of decision-making but are unlikely to be able to control the exercise of these modes by front-line decision-makers, who “can use information about that process available in the system to learn more about the process” and maybe even “learn more than their manager”.[xxv]
There is thus a serious question as to whose norms will ultimately triumph, or at least, shape the other more than they are shaped by it – to put the point provocatively, those of the lawyers, or those of the technologists? Make no mistake about the potential for a clash of value systems. Central to administrative law and administrative justice are norms of justification, transparency and intelligibility.[xxvi] In recent decades, courts and commentators alike have emphasised the virtues of a culture of justification in governmental decision-making, and there has been a strong focus on “the justice inherent in decision making”[xxvii] and “those qualities of a decision process that provide arguments for the acceptability of its decisions”.[xxviii] By contrast, Silicon Valley has been dominated by believers in solutionism;[xxix] notwithstanding the “philosophically tortuous” relationship between data and knowledge,[xxx] “Big Data is portrayed as a gradual evolution of the possibilities that now exist to interconnect different data sources situated on multiple geographical scales, and to process and analyse the hence generated data in increasingly automated ways”.[xxxi] Causal relationships between individuals and events – central to governmental decision-making – are displaced or replaced by reliance on correlation:
This is a world where massive amounts of data and applied mathematics replace every other tool that might be brought to bear. Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.[xxxii]
Dogmatism would certainly be inappropriate in responding to the need to strike a balance between risk and reward, as it is more likely that the competing systems – of causation/justification on the one hand and of correlation/computation on the other – will bend rather than break.[xxxiii]
Paul Daly is a Senior Lecturer in Public Law at Queens’ College, University of Cambridge, and soon to be Professor of Law at the University of Ottawa, Canada.
[i] Danah Boyd and Kate Crawford, “Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon” (2012) 15 Information, Communication & Society 662, at p. 663.
[ii] Hamid Ekbia et al, “Big Data, Bigger Dilemmas: A Critical Review” (2014) 66(8) Journal of the Association for Information Science and Technology 1523, at p. 1539.
[iii] Karen Yeung, “Algorithmic Regulation: A Critical Interrogation” (2018) 12 Regulation & Governance 505, at p. 505. See similarly Paul Dourish, “Algorithms and their Others: Algorithmic Culture in Context” (2016) (July-December) Big Data & Society 1; Karen Yeung, “‘Hypernudge’: Big Data as a Mode of Regulation by Design” (2017) 20 Information, Communication & Society 118, at p. 119.
[iv] See e.g. Guy Kirkwood, “The Government’s Big Opportunity”, Reformer Blogs, 22 March 2017 (https://reform.uk/the-reformer/governments-big-opportunity).
[v] See e.g. Taylor Owen, “The Violence of Algorithms” in Taylor Owen, Disruptive Power: The Crisis of the State in the Digital Age (Oxford University Press, Oxford, 2015), at p. 175; Cass Sunstein, “Algorithms, Correcting Biases” (2018), at p. 3.
[vi] See e.g. Joseph Heath, Enlightenment 2.0: Restoring Sanity to Our Politics, Our Economy, and Our Lives (Harper Collins, New York, 2014); Clive Thompson, Smarter Than You Think: How Technology Is Changing Our Minds for the Better (Penguin, New York, 2013).
[vii] Andrew Le Sueur, “Robot Government: Automated Decision-Making and its Implications for Parliament” in Alexander Horne and Andrew Le Sueur eds., Parliament: Legislation and Accountability (Hart Publishing, Oxford, 2016).
[viii] Karen Yeung, “Algorithmic Regulation: A Critical Interrogation” (2018) 12 Regulation & Governance 505, at p. 518: “simple forms of algorithmic regulation display continuity with long established approaches to control that seek to harness architecture and design to shape social behavior and outcomes…”
[ix] My children’s school has the following admission criteria: “1. Children in care, also known as Looked After Children (LAC). 2. Children living in the catchment area with a sibling at the school at the time of admission. 3. Children living in the catchment area. 4. Children living outside the catchment area who have a sibling at the school at the time of the admission. 5. Children living outside the catchment area who have been unable to gain a place at their catchment area school because of oversubscription. 6. Children living outside the catchment area, but nearest the school measured by a straight line. In cases of equal merit in each set of criteria, priority will go to children living nearest the school as measured by a straight line”.
[x] Paul Dourish, “Algorithms and their Others: Algorithmic Culture in Context” (2016) (July-December) Big Data & Society 1, at pp. 6-7.
[xi] The Citizen Lab, Bots at the Gate: A Human Rights Analysis of Automated Decision-Making in Canada’s Immigration and Refugee System (Toronto, University of Toronto, 2018), at p. 11. See generally Frank Pasquale, The Black Box Society: The Secret Algorithms That Control Money and Information (Harvard University Press, Cambridge, 2015).
[xii] For example, the Canada Border Services Agency “uses a Scenario Based Targeting (SBT) system to identify potential security threats, using algorithms to process large volumes of personal information (such as age, gender, nationality, and travel routes) to profile individuals”. The Citizen Lab, Bots at the Gate: A Human Rights Analysis of Automated Decision-Making in Canada’s Immigration and Refugee System (Toronto, University of Toronto, 2018), at p. 21.
[xiii] Karen Yeung, “Algorithmic Regulation: A Critical Interrogation” (2018) 12 Regulation & Governance 505, at p. 509.
[xiv] Rob Kitchin and Tracey P Lauriault, “Towards Critical Data Studies: Charting and Unpacking Data Assemblages and their Work”, in Jim Thatcher, Josef Eckert and Andrew Shears eds., Thinking Big Data in Geography: New Regimes, New Research (University of Nebraska Press, Lincoln, 2018).
[xv] See especially Karen Yeung, “‘Hypernudge’: Big Data as a Mode of Regulation by Design” (2017) 20 Information, Communication & Society 118, at p. 122, p. 130.
[xvi] Taylor Owen, “The Violence of Algorithms” in Taylor Owen, Disruptive Power: The Crisis of the State in the Digital Age (Oxford University Press, Oxford, 2015), at p. 169.
[xvii] See e.g. Commonwealth Ombudsman, Centrelink’s automated debt raising and recovery system, April 2017, Appendix B, noting the shortcomings in data used to collect overpayments of income support; Melissa Hamilton, “The Biased Algorithm: Evidence of Disparate Impact on Hispanics” (2019) 56 American Criminal Law Review (forthcoming).
[xviii] John Danaher, “The Threat of Algocracy: Reality, Resistance and Accommodation” (2016) 29 Philosophy & Technology 245.
[xix] Linda Skitka et al, “Does Automation Bias Decision-making?” (1999) 51 International Journal of Human-Computer Studies 991.
[xx] See e.g. Karen Yeung, “Algorithmic Regulation: A Critical Interrogation” (2018) 12 Regulation & Governance 505, at p. 514: “Continuous, real-time surveillance is critical to the operation of all forms of algorithmic regulation, whether reactive or predictive in orientation”.
[xxi] Danah Boyd and Kate Crawford, “Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon” (2012) 15 Information, Communication & Society 662, at p. 672.
[xxii] See generally Rob Kitchin, The Data Revolution: Big Data, Open Data, Data Infrastructures & their Consequences (Sage, London, 2014).
[xxiii] The Citizen Lab, Bots at the Gate: A Human Rights Analysis of Automated Decision-Making in Canada’s Immigration and Refugee System (Toronto, University of Toronto, 2018), at p. 11.
[xxiv] Rob Kitchin and Tracey P Lauriault, “Towards Critical Data Studies: Charting and Unpacking Data Assemblages and their Work”, in Jim Thatcher, Josef Eckert and Andrew Shears eds., Thinking Big Data in Geography: New Regimes, New Research (University of Nebraska Press, Lincoln, 2018).
[xxv] Andrew Burton-Jones, “What Have We Learned from the Smart Machine?” (2014) 24 Information and Organization 71, at p. 77. See e.g. Jennifer Raso, “Unity in the Eye of the Beholder? Reasons for Decision in Theory and Practice”, Public Law Conference 2016, at p. 12. Though this can lead in turn to reactions from management, which is “likely to turn to another tactic – leveraging the informating potential of an IT not for learning and improvement but for control and enforcement”. Id.
[xxvi] See e.g. Dunsmuir v New Brunswick [2008] 1 SCR 190, at para. 47.
[xxvii] Michael Adler, “Understanding and Analyzing Administrative Justice” in Michael Adler ed., Administrative Justice in Context (Hart, Oxford, 2010), at p. 129.
[xxviii] Jerry Mashaw, Bureaucratic Justice: Managing Social Security Disability Claims (Yale University Press, New Haven, 1983), at pp. 24-25.
[xxix] Evgeny Morozov, To Save Everything, Click Here: Technology, Solutionism, and the Urge to Fix Problems that Don’t Exist (Allen Lane, New York, 2013).
[xxx] Hamid Ekbia et al, “Big Data, Bigger Dilemmas: A Critical Review” (2014) 66(8) Journal of the Association for Information Science and Technology 1523, at p. 1528.
[xxxi] Francisco R Klauser and Anders Albrechtslund, “From Self-Tracking to Smart Urban Infrastructures: Towards an Interdisciplinary Research Agenda on Big Data” (2014) 12 Surveillance & Society 273, at p. 273.
[xxxii] Chris Anderson, “End of Theory: The Data Deluge Makes the Scientific Method Obsolete”, Wired Magazine, June 23, 2008. See also Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: The Essential Guide to Work, Life and Learning in the Age of Insight (John Murray, London, 2013), at pp. 8-9: “The era of big data challenges the way we live and interact with the world. Most strikingly, society will need to shed some of its obsession for causality in exchange for simple correlations: not knowing why but only what. This overturns centuries of established practices and challenges our most basic understanding of how to make decisions and comprehend reality”.
[xxxiii] This is a guess, admittedly, but an educated one, based on the considerable and considerably entrenched power of law on the one hand and technology on the other.