…and the architects, designers, data scientists, and developers will think we are nuts
I’ve been driven back to the blog to talk about one very specific aspect of privacy, data protection and Artificial Intelligence (substitute Machine Learning or Algorithms as appropriate). Or, in plainer English: mathematical equations driving conclusions about us and decisions that can impact our individual or collective rights. This, I’m going to argue, is the frequently ignored link in the AI supply chain: the part where non-specialists are required to apply the product of that artificial labour.
This is what got my dander up: Technologists doubting that people will buy into algorithmic or intelligent machine outputs without objectively checking the workings out…or at least giving it a best-efforts sniff test. Kind of depends if it’s Explainable AI (yep, that’s an actual term, abbreviated to XAI – here’s David Gunning writing for DARPA on that) as opposed to the sort of decision making we can’t really unpick. They know this stuff grows from filthy dirty data and a patchwork of variably borrowed and kludged code. The whole thing stitched together by programmers who consciously and unconsciously stir in personal, societal, and corporate views. No-one’s denying it will enable incredible feats of data wrangling and generate insights backed by more evidence than ever before, but right now they know it’s more often an elephant doing circus tricks than the family dog.
(By Alexandra Kleeman, November 20, 2016, for the New Yorker)
Bit opaque on that last bit? Yeah, thought so. What I meant was that AI (used more literally this time) is almost always dependent on behind-the-scenes humans to demonstrate what it could practically achieve, rather than going off and actually achieving it. It’s still too error prone and impacted by the ongoing struggle to nail the concept of efficient learning. The staged performances help firms to push back against ‘progress-averse’ concerns and secure continued investment.
The main limitation on realising whole swathes of advertised benefits is not, as you might think, funding. The stunt-monging, thought-leadering, and visionarying has done its job. After the initial wave of investment enthusiasm petered out for want of useful, monetisable outputs, the hype regrew, and you will now be hard pushed to find a blue chip firm that doesn’t have a big chunk of development budget bet on bigger data and advanced versions of mathematical means to manipulate it. No, the main limitation on generating many projected benefits is the quality, richness and consistency of available data. A challenge that spawned the term “Dark Data” and a million startups aiming to help with data discovery and cleanup, or to collect a better-quality data product. The latter is best noted in the unstoppable expansion of the Internet of Things.
By Jessica Bursztynsky, Monday June 10th 2019, for CNBC
If you sensed a certain bullying edge in the push to acquire the next smart speaker, you probably weren’t imagining it. A commensurate bullying edge accompanied lots of commercial partnerships to embed technology into more homes, public spaces, and everyday things, plus the same again for related merger and acquisition activity. A co-ordinated attempt, as the digital trust pendulum swings away from obvious players, to offset worry against new bling.
Consider the headline space given to privacy by PR firms working for those with most control over the means of information production. That’s ample evidence of how much controlling this narrative matters.
There’s also, despite anti-monopoly mumblings, a synergy with traditional businesses and policy makers. There’s heavy reliance on the ubiquity, distribution and capabilities of their data handling infrastructure, no matter how much the powers that be doth protest. Not least, as I talk about later, in their drive to regulate our online space. Most (but not all) don’t want that policing buck to stop with them.
But what is it that really concerns me about the current and potential fallout? Is it just ‘think of the children’ Luddite posturing dressed up as social concern? Is it privacy professional blinkers that have pushed me into an ‘us’ position against the tech and surveillance ‘them’? Is it just me having a bad day and spewing bile for chuckles? Nope, none of the above. It’s the harm I’ve historically seen done in the name of knee-jerk rule making and almost two decades living in or around for-profit tech change, supply chains and associated governance work. Time spent pragmatically balancing the successful ongoing operation of an organisation with legal, regulatory, and ethical privacy considerations on behalf of those who get to own the risks.
Grand designs and governance gaps
The principles here are not new, no matter how synaptical and quantum they sound. We have always tried to scale and standardise decisions made about people and about things that can impact them. Data based decision making isn’t a buzzphrase, it’s standard practice for anyone who likes to work on something better than a hunch. Accounting data, stock market performance data, staff performance data, exam results, risk assessments, every piece of scientific research known to humankind…well, a decent chunk of it.
All of that, since computers were invented, has leveraged digital ability to do sums quicker, at greater scale, and more reliably.
So what changed? Very simply: the number of humans it is possible to influence, the amount of data available to do so, the small number of firms who hold the keys to both the data and the influence, and the lack of fit-for-purpose rules and rule makers.
Let’s revisit that rather arbitrary list again. This time, while reading through, consider the checks, balances, and oversight measures you’d expect to see. Things like checks on consistency of scoring, sampling to ensure minimal levels of false negatives and false positives, reconciliation against sources, auditing of protective controls, tests to see if results can be reproduced, control groups to make sure results aren’t caused by something other than what you are testing for.
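The sampling checks mentioned above can be made concrete. Here’s a minimal, purely illustrative Python sketch of the kind of false-positive / false-negative audit a labelled sample of algorithmic decisions would go through (the function name and sample data are my own, hypothetical, invention):

```python
# Illustrative only: sampling a batch of automated decisions against
# human-verified ground truth to estimate error rates.

def error_rates(predicted, actual):
    """Return (false_positive_rate, false_negative_rate) for binary outcomes."""
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

# A sampled batch of decisions (True = flagged) versus the verified truth.
predicted = [True, False, True, True, False, False]
actual    = [True, False, False, True, True, False]
fpr, fnr = error_rates(predicted, actual)  # both come out at 1/3 here
```

Trivial as the sums are, the point stands: you can only run them if someone budgets for collecting the ground-truth labels in the first place.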
The many measures in place have evolved over many many years. All designed to make sure we can trust conclusions while minimising (by admittedly very variable yardsticks) harm. Harm that places like the Alan Turing Institute are trying to better define in context of new technologies.
Are historical controls perfect? Heck no. Do all parties agree on how much rigour is enough? Pfft. Has broken culture and bad practice driven detours down blindly biased alleys? Ohhhh yeah: for example, see Caroline Criado Perez’s book Invisible Women, about the amount of historical research that failed to factor in females, and Racial Bias in Pain Perception and Response (PDF) by Vani A. Mathur, Jennifer A. Richeson, Judith A. Paice, Michael Muzyka, and Joan Y. Chiao, published in May 2014 in the Journal of Pain. But a reasonably well-regulated trail of prescribed practice is in place to try and make outputs as fair, accurate, and transparent as possible, and bodies are mostly there to support a challenge where it’s not.
However, for the purpose of this post, I’m most focused outside the lab. I’m interested in the work that starts after the academic papers are published: the proofs of commercial and public sector concept, the early adopter implementations, the first rollouts at scale and all the hidden and hugely time consuming steps to obtain, clean and label data for machines:
By Tristan Greene, 12th June 2019, for The Next Web
We can argue about fairness and utility of the aforementioned checks and balances until we are blue in the face, but can you confidently argue there are no specific gaps to close for nascent AI? The kind of gaps that layer additional bias and missed implications onto the historical baseline.
In working to configure and educate machines to yield marketable vs merely interesting results, data scientists are leveraging huge swathes of existing scientific knowledge and massive lakes of acquired data, but they are not, in the main, applying the same scientific standards of practice and ethical oversight (assuming that earlier academic oversight, debatably influenced by AI vested interests, was an adequate fit).
In the same way we inherit inaccuracies and omissions from one data set to the next, so too our governance failures reverberate and amplify through every link in the development and supply chain.
Now consider this:
Childhood factors influencing vulnerability to radicalisation, social media flags for mental illness, Instagram evidence of antisocial behaviour, facial features harvested from Facebook that correlate with non-heteronormative traits, speech patterns suggesting pre-disposition to addiction, internet activity indicating you are trying to get pregnant, or investigating how to end a pregnancy, evidence of political suggestibility derived from purchase histories provided by Facebook / Twitter / Instagram / Google / Amazon and all of their retail-wide beacons, trackers and cookies.
Here’s Johnny Ryan from PageFair on the transparency, fairness, and resulting legal issues that last example can cause:
…every website or app with ads has dozens if not hundreds of third-party vendors collecting data or depositing cookies, including tag management systems, ad platforms, content management systems, user analytics vendors and the like.
That means user data is collected and passed around…it’s inevitable there is “data leakage,” where your personal data gets shared or employed by some third-party vendor — or by a vendor working with that third-party vendor — in some way the user didn’t approve
From “Consent is unworkable for programmatic ads in the era of GDPR“, an interview with Johnny Ryan by Barry Levine on January 11, 2018, for Martech Today.
What do you know about checks, balances, and oversight measures for those kinds of things?
Did you know that your offline purchase history, interaction with state institutions, your Gmail and the rest of your online life provide the raw materials for this? Did you expect your kids’ school to quietly hand over attendance, illness, immigration status, and special needs data to government bodies? (Check out Defend Digital Me to see if all that happened in the UK.) Did you stop to think that the business case for consumer genetic testing was always to build the best data set to sell to big pharma or other interested parties? Did you know that every time you hit the web an unseen auction happens within milliseconds to decide which firm gets to add new data points to an aggregate profile of your virtual self and serve you up real-time ads? The upgraded profile is then packaged, priced and sold to anyone interested. Perhaps to tailor cartoon character socks that pop up on your Instagram feed, or perhaps put together with another data set to inform more concerning conclusions.
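That unseen millisecond auction is, in simplified form, a second-price auction among ad networks that have each already priced your profile. A loose Python sketch, with hypothetical network names and bid values (real programmatic exchanges, built on standards like OpenRTB, involve far more moving parts):

```python
# Hypothetical, heavily simplified real-time bidding auction.
# Network names and prices are illustrative, not any real exchange's API.

def run_auction(bids):
    """Second-price auction: highest bidder wins the ad slot,
    but pays the runner-up's price."""
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    clearing_price = ranked[1][1]  # winner pays the second-highest bid
    return winner, clearing_price

# Each network has already valued this impression using its profile of you.
bids = {"ad_network_a": 0.42, "ad_network_b": 0.65, "ad_network_c": 0.31}
winner, price = run_auction(bids)  # ad_network_b wins, pays 0.42
```

The pricing inputs are the interesting part: each bid is a function of the aggregate profile described above, which is exactly why every auction is also a data-enrichment event.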
By Amy Williams, 12th June 2013, for The Telegraph
It used to be the government who had the biggest book of data about citizens and residents, but now the biggest and richest lakes of data are in private hands (turning a temporarily blind eye to countries that struggle to separate their legislature, executive and judiciary from all of their private firms). That list of data oligopolists isn’t, as you might think, always topped by the tech firms; that honour went to Walmart in a recent survey. But the interconnectedness of many of these entities creates a very tangled web.
Of course there will be versions of due diligence. Of course there are technical and organisational controls. Of course there’s a degree of discretion about sharing data with third parties. But all that is largely self-defined and self-certified as fair and fit for purpose against a benchmark highly influenced by commercial and political interests.
Any individual who has worked to implement controls in a firm driven by financial results or political agendas knows that you win some, but you can lose many more. In a contest for money, time and appropriately skilled bodies, ‘Should we…?’ almost invariably loses to ‘Can we…?’ rapidly followed by ‘When will we?’ And when capability is in place and bumping up the valuation, or other desirable progress, those whose remuneration and wellbeing depends on last quarter’s results will fight tooth and nail to keep the juggernaut rolling.
It takes a persistent and dramatic weight of well publicised collateral damage to start to apply any brakes, but that’s only where a connection between the cause of fallout and the fallout itself remains clear and that separation of powers is working. Much of the risk linked to harmful decisions is transferred to the person whose data you used and the person who buys the product or service. That might be obscured by a string of processors as data meanders up, down and across supply chains. All of which makes duly diligent change and meaningful governance a significant challenge.
Not evil, just free – free data, free rein, freedom to fail fast and leave things minimally viable before testing them on you and me – free the way all unchecked markets tend to be.
Can the industry self-police and if not, regulation at what cost?
Ignore the individual and cumulative gotchas that slip through the assurance net for a moment. I’m a firm believer that people are overwhelmingly more good than bad and only a small minority are designing solutions at obvious risk of causing us harm. Or rather, only a small minority are aware they have a hand in designing solutions at risk of causing us individual or collective harm (Chinese walls and NDAs aren’t just about protecting intellectual property from outsiders).
Many hope that tech workers will consistently and successfully rise up against dangerous and unethical practices, but we know, in the daily power churn, it is only a tiny minority who will blow up a career to strongly object. Here we are relying on ground floor tech company staff to boycott, strike, or resign to defend our collective moral and ethical codes until the statutory ones, and those who can enforce them, catch up. That’s not to belittle those having a damn fine try, but it’s generally only effective where critical mass overtakes immediate risk of being professionally disappeared.
Is regulation the better option? Goodness knows bad regulation can be worse than no regulation. Cory Doctorow recently had a whole lot to say about that (includes a link to his full talk given at the re:publica conference in Germany on 7th May this year). He argues that we would be far too reliant on the biggest tech firms to provide the infrastructure needed to implement regulation. In fact his predicted upshot is gifting them a true monopoly in this policing role, which would, ironically, make them definitively too big to fail. Far better, he argues, to introduce competition and give that competition means to dilute the oligopoly without fearing corporate revenge. Quoting from his talk, in context of the EU Copyright Directive and subsequent plans for wholesale automatic content filtering:
“Almost overnight we’ve gone from an internet where speech had the presumption of innocence…to one in which all speech is guilty until proven innocent”
It’s also a hotly debated topic in context of UK government proposals to mitigate online harm as set out in their 30th April Online Harms white paper, the Internet Association response and the related development of a code of practice for Age Appropriate Design (PDF) published by the ICO (here’s Heather Burns – @webdevlaw – with her excellent critique of that). Then there’s the related, but apparently not coordinated, ‘#NewFilter’ recommendations (PDF) from the UK’s All Party Parliamentary Group on Social Media. It’s the output from their inquiry into “Managing the Impact of Social Media on Young People’s Mental Health and Wellbeing”. Last, but definitely not least, there’s the likely rapid passage of more global, state and federal privacy laws. The latter will lean heavily on specialist tech firms to head off development of an inconsistent privacy patchwork and enable implementation of resulting rules.
Very little of the oversight required to ensure compliance can be done without an algorithm, or a more intelligent evolution of such things. That cumulatively begs a bunch of very important and far from new questions (questions covered in depth in linked critiques): Who will be watching the watchmen? Who will define and interpret the rules? How well will they deal with false accusations and errors? How can they guard against abuse of the monitoring tools?
Computer says no
(as the David Walliams Little Britain sketch goes)
Swinging all the way back to the top, and bearing in mind everything discussed so far, can you now see why folk in the know think we would be stupid to blindly trust an ‘intelligent’ machine? But what will they do to guard against it? It doesn’t fit the sales narrative. Solutions are purchased off the back of belief in faster / cheaper / better / more (preferably, but unrealistically, all four). It’s only after the cheque gets signed and a solution is weaving its way into production that the real limitations and dependencies come to light. Soon the purchase price looks cheap compared to the run budget necessary to audit outputs and fix them. Better to resolve after release as a function of support and bank the lessons learned for next time?
In some circumstances, where use cases are very narrowly defined, the machines might hit the ground running. Initial outputs look really promising. Sensible-sounding conclusions and patterns are produced, but there’s the rub: what if the expected conclusions and patterns are wrong? Arguably a far greater risk to those potentially impacted than a rushed release with acknowledged kinks to iron out.
The screen feeds you the conclusion about physical ability, mental wellness, talent, sexuality, race, geographical location, religious beliefs, immigrant status, tendency towards criminality, reliability, love of dark chocolate, preference for black cats and what that means for the decision at hand. The more contentious the decision, the more harm it can cause, the more upset it will provoke, the more people on the ground will lean on automated responses. Not doing so is likely more than their job is worth.
That’s what the machine learning and AI specialists on stage didn’t get. In their intellectual and tech silo they didn’t foresee that the ultimate upshot is relatively unquestioning faith in the pitch. “But people shouldn’t take conclusions for granted” I was told in an incredulous tone “The data can be flawed. They should always check”.
Solutions will be landed on people who stand zero chance of unravelling the basis for a conclusion, even if they are given permission to try. It’s a very human tendency to want to distance oneself from stuff that can cause ructions. Look inside and tell me if you would rather inform someone that the computer says ‘incompetent’, or explain how you and colleagues sat around a table and came to that conclusion? I’ll leave it up to readers to extrapolate out to the ultimate “Just following orders” conclusion. The screenshot below links to Heather’s thread about Rene Carmille, the IBM analyst whose actions saved thousands during WWII. An example that serves to underline the earlier truth by being an awe inspiring exception to the horrific rule. Is any more evidence needed of a desire to delegate responsibility, especially when you are an expendable and low paid cog in a very large and intimidating machine?
Bringing us back to less dramatic earth: 3rd party customer service centres follow scripts for a reason. The reason is cost. Savings are achieved by a combination of efficiency, standardisation, lowering the mean level of skill required to do the job and therefore the cost of hiring personnel to do it. Do those sound like familiar drivers to airlift in some AI? Why would a firm encourage their employees to interfere with that? Why would the free market mechanisms described above, in a relatively unregulated market, encourage adequate due diligence, testing, and the right run budget to keep damaging mistakes and cumulative implications under control?
Back to daily realities:
How would you challenge an algorithmic accusation of incompetence based on an AI-powered performance management solution? How would you challenge mistaken withdrawal of benefits based purely on computer-controlled assessment of your online activity? How would you put right a machine generated mental health risk rating, based on school records that were subsumed into an NHS database, that then found its way over to a private health insurer, who’s now denying you coverage for drugs due to a new computer conclusion, dependent on the school data, about likelihood of your non-compliance with a treatment regime?
How would you even know about those calculations? How many organisations would put you through to a person with the foggiest idea about how it works? If they could or would, how many attempts do you think that would take? Even if, via the information commissioner or another public body, you make it to the manufacturer, can they even unpick the fine tuned and evolved outputs of their own digital tools?
By what factor will people who can’t challenge, won’t challenge, or don’t know what to challenge, outweigh those who can, will, and do? How does that factor change when you take on board that the most vulnerable populations will be by far the worst affected, because the push to accept data collection and processing is far greater and made far harder to resist? Below is an excerpt from the ongoing saga of the Irish Public Services Card, widely viewed by data protection professionals as a breach of the human right to privacy because the potential harm caused is disproportionate to any declared public interest, especially since government bases for its introduction have been changing over time.
“The bottom line is that the State cannot simply introduce measures such as the public services card, which process the personal data of the entire population, without demonstrating that it is necessary to meet a recognised public interest objective and that it is a necessary and proportionate measure when all facts are taken into account”
By Solicitor Fred Logue, September 2017, for the Irish Times
How will that inequity influence the time it takes for mistakes and impacts to come to meaningful light? How much time and harm will pass before collateral damage makes it onto a risk register seen as actionable by the board, or an oversight body with teeth?
REMEMBER: We are not just talking about breaches, we are talking about multiple quiet mistakes, or outputs of ‘normal’ processing, that force people to be guilty until they can prove themselves innocent in the eyes of those who have been conditioned and paid to place excessive trust in inaccurate machines.
The world is changing. Snowden, the GDPR, and events revolving around Facebook and Cambridge Analytica have collectively provoked a privacy sea change that few initially had faith would stick. Awareness of the value of our virtual selves to us and others is rising. The big tech players are all pitching for influence and offering financial support to those likely to frame new privacy rules. The newly appointed data ethics bodies have great representation, especially (positively or negatively, depending on your perspective) from people with commercial and public policy skin in the game. Those raising awareness are incredibly active internationally, albeit, too often, in activist and academic echo chambers. But while there’s little prospect of accountability for harm outlasting release, sale, or a first declared data processing purpose, and while recourse to redress fails to meaningfully land with those accepting risks on our behalf behind closed public and private doors, I’m not, to say the very least, confident in how this will shake out.
Featured image copyright: lightwise
I am still working to get a deeper understanding of key principles and challenges around AI. For those trying to do the same with some kind of data related skin in the game, I thoroughly recommend the following books for an easy-to-consume practical perspective:
Invisible Women, Caroline Criado Perez
Architects of Intelligence, Martin Ford
Twitter and Teargas: The power and fragility of networked protest, Zeynep Tufekci
Ethical Data and Information Management: Concepts, Tools and Methods