Phenomenal World

July 3rd, 2019

The Politics of Machine Learning, pt. II

—   Maximilian Kasy

For the first part of this article, click here.

Common differentiation

The uses of algorithms discussed in the first part of this article vary widely: from hiring decisions to bail assignment, from political campaigns to military intelligence.

Across all these applications of machine learning methods, there is a common thread: Data on individuals is used to treat different individuals differently. In the past, broadly speaking, such commercial and government activities used to treat everyone in a given population more or less similarly—the same advertisements, the same prices, the same political slogans. More and more, everyone gets personalized advertisements, personalized prices, and personalized political messages. New inequalities are created and new fragmentations of discourse are introduced.

Is that a problem? Well, it depends. I will discuss two types of concerns. The first type, relevant in particular to news and political messaging, is that the differentiation of messages is by itself a source of problems. Often discussed under the header of (filter) bubbles and political polarization, this concern doesn’t require any “ordering” of political messages, in the sense that some messages are inherently better for the recipient than others—the question is whether it is bad that different people are coming to live in incommensurably different (political) worlds.

The second type of concern is about fairness, discrimination, or equality. This type of concern requires that some treatments are better than others. In the examples above, for instance, lower prices, a greater chance of being hired, and a lower chance of being incarcerated are usually deemed preferable. The question is whether it is unfair that different people are being charged different prices, or have a different chance of being sent to jail while on trial.

Filters and bubbles

The amount of news items or pieces of political information that we could get to read is, for all practical purposes, infinite. Any method of delivering news or information is therefore by necessity selective. And any selection is necessarily ideological: Whose death matters enough to be reported? What social or environmental issues are brought to our attention? Which corruption scandals deserve discussion?

Selectivity is nothing new. Any traditional way of delivering news or political messages is selective. What is new with algorithmic selection is that the results are highly individually customized. Much of the customization is performed by search engines and social networks—platforms that aim to maximize revenue from ad clicks. Customization is also done by political parties aiming to maximize effective vote shares.

Filter logics

Various logics drive the customization of our news feeds. Content is presented to internet users in the form of ordered lists, and since, in practice, we only ever see the top of these lists, the ordering constitutes a selection of what we get to see.

The simplest way of organizing these lists is by time of posting, with the most recent posts at the top. But on the major social networks, the posts you're shown are restricted to those posted by your connections, or by the pages you follow.

Search engines rank pages by potential relevance to your search. The most famous method of doing so—popularized by Google—is to create page ranks, where pages receive a higher ranking if other webpages link to them, and greater weight is given to links coming from pages that have high rank themselves.
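The recursive idea described above (a page ranks highly if highly ranked pages link to it) can be sketched as a small power-iteration computation. This is an illustrative toy implementation of the classic PageRank recursion, not Google's production system; the example graph and damping factor are standard textbook choices.

```python
import numpy as np

def pagerank(links, damping=0.85, tol=1e-9):
    """Power-iteration PageRank on an adjacency dict {page: [pages it links to]}."""
    pages = sorted(links)
    n = len(pages)
    idx = {p: i for i, p in enumerate(pages)}
    # Column-stochastic transition matrix: each page splits its "vote" among its outlinks.
    M = np.zeros((n, n))
    for p, outs in links.items():
        if outs:
            for q in outs:
                M[idx[q], idx[p]] = 1.0 / len(outs)
        else:  # dangling page with no outlinks: spread its weight uniformly
            M[:, idx[p]] = 1.0 / n
    r = np.full(n, 1.0 / n)  # start from a uniform distribution
    while True:
        # A page's rank is the damped sum of rank flowing in from pages linking to it.
        r_new = damping * M @ r + (1 - damping) / n
        if np.abs(r_new - r).sum() < tol:
            return dict(zip(pages, r_new))
        r = r_new

# Page "a" is linked to by both "b" and "c", so it ends up ranked highest.
ranks = pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]})
```

Note how the recursion rewards not just the number of inbound links but their source: the link from "b" counts for more once "b" itself accumulates rank.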

The ordering on more recent implementations of social networks and search engines tends to be considerably more complex. A great number of variables specific to the reader, the poster, and the content itself are used to predict the likelihood of engagement, and posts are ordered and selected accordingly.

Bubbles and poles

These various filtering logics contribute to an individually specific version of the world, constructed via selection. To the extent that your connections and the pages you follow have a similar worldview to you, this results in self-confirming visions of the world.

Filtering by predicted engagement results in a self-confirming bubble: Any time you click on a link, share or like a post, that information is recorded, and you are then more likely to see similar posts in the future—and less likely to see different posts.
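The feedback loop described here can be made concrete with a deliberately simplified sketch. The topic weights, learning rate, and session structure below are invented for illustration; real feed-ranking systems use far richer features, but the reinforcement dynamic is the same.

```python
def rank_feed(posts, topic_weights):
    """Order posts by a toy engagement score: the affinity the user has shown per topic."""
    return sorted(posts, key=lambda p: topic_weights.get(p["topic"], 0.0), reverse=True)

def record_click(topic_weights, post, lr=0.5):
    """Each click raises the clicked topic's weight, so similar posts rank higher next time."""
    topic_weights[post["topic"]] = topic_weights.get(post["topic"], 0.0) + lr

# One simulated session: a mixed pool of posts, and a user with a single prior click.
posts = [{"id": i, "topic": t} for i, t in enumerate(["sports", "politics", "cooking"] * 3)]
weights = {"politics": 0.1}  # one past click on a politics post
for _ in range(3):
    feed = rank_feed(posts, weights)
    record_click(weights, feed[0])  # the user engages with whatever is ranked first
```

After only three rounds, the politics weight dominates and the other topics never surface at the top of the feed: the system's own past recommendations determine what it learns about the user.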

It is often argued that filter bubbles contribute to political polarization and the breakdown of communication across political camps. There is merit to the observation that the news content seen by different people is drifting apart because of these selection effects. But skepticism is due regarding the argument that this is a prime explanatory factor of recent political developments. Older voters, less likely to receive their news and political information online, are more polarized in their political opinions than youth. And the bad old cable news channels, which do not rely on any individual customization, might have at least as much to contribute to polarized (and bigoted) worldviews as search engines and digital social networks.

Disenchantment

Let’s move on from the filtering of online news to the targeting of political campaigns. What are the consequences of such targeting? First, only a small subset of the electorate is actively targeted by campaigns. This was already true before the advent of data-driven campaigning, due to the intricacies of electoral systems such as the one in the US. Disproportionate consideration has long been given to swing states and swing districts, where majorities are narrow. Data-driven targeted campaigning, however, massively compounds these effects of the electoral system. To be a valuable target of political resources, one must not only live in a marginal electoral district, but must also be predicted to be on the margin of voting for one party (candidate) or another, or on the margin of voting or not voting. Additionally, those who are targeted receive individually customized messages, rather than partaking in a common narrative.

It certainly seems plausible that this individualization of political messages contributes to a general disenchantment with electoral democracy. Many voters are ignored because they are irrelevant for winning the election. And those voters who are not ignored might perceive that they are fed whatever they want to hear, by a data-driven machinery designed to maximize votes, and that their neighbors might be fed different messages as seen fit by this machinery. Such voters might be forgiven for questioning the authenticity and trustworthiness of the messages they receive. (Interestingly, the Bernie Sanders campaign made a conscious choice to forgo the use of tailored messaging, sticking instead to the same talking points and arguments for everyone. Consistency helped the campaign's arguments seem more credible and authentic.)

Targeting of this kind, then, might maximize votes individually and in the short run, but could backfire in the aggregate and in the long run; if not for the campaigners themselves, then at least for electoral democracy at large.

Fairness and equality

Thus far, I have discussed concerns about targeting, driven by machine learning, in the arena of news and politics. Concerns in this arena are primarily not about unfair differentiation between individuals, but about the fragmentation of political discourse.

Let us now move on to concerns in the arena of the economy and the penal system, which focus on questions of fairness and justice in the differentiation between individuals.

Transparency

In Kafka’s novel The Trial, the protagonist Josef K. is arrested by unidentified agents from an unspecified agency for an unspecified crime. He never finds out what he is accused of, nor when his trial is taking place, but he ultimately accepts his execution.

Machine learning algorithms are similar to the hidden bureaucracy of The Trial. In many cases they are like a black box, with no human knowing how they came to their predictions or decisions. Suppose your CV is automatically rejected when applying for a job, or you are denied insurance, or you are jailed while awaiting trial. Often these decisions are made by a machine, based on past data. In practice you have no way of knowing which data about you went into the decision, and how these data influenced the decision.

To remedy this situation and introduce at least some degree of “due process” would require algorithmic transparency. For important decisions, one should have a right to know what information was used to make these decisions, and how the decisions depend on that information. There should also be a right to contest inaccurate information.

The unfairness of targeting

Transparency is a fairly minimal requirement, stating a right to know how unequal treatment came about. Inequalities in treatment themselves might be considered unfair, however, even when they are transparently generated.

Algorithms can be criticized on the basis of various notions of fairness. First, prediction algorithms might systematically reproduce pre-existing biases against certain groups. Algorithms that assess job-candidates, for instance, might discriminate against women, since in the past women were less likely to be promoted for that job.

Second, it is considered discriminatory to use certain variables—like race and gender—in the decision process at all. A judge, for instance, should not be able to treat a Black defendant differently from a white defendant with a similar biography and criminal history.

Third, even if certain variables such as ethnicity or gender are excluded from prediction, algorithms might still treat the corresponding groups systematically differently. With enough other data, it is fairly easy for an algorithm to “guess” someone’s ethnicity or gender, and to implicitly base decisions on this guess.

Fourth and finally, it might be argued that fairness demands the same treatment of everybody not only between groups, but also within groups. Why should some people go to jail and others not, based on crimes they haven’t yet committed? And why should some people pay more than others for goods they purchase, just because they need them more?

These four criticisms are based on increasingly stronger notions of fairness. The development of technology pushes us to question each notion of fairness and move to the next: Even if algorithms get predictions right, they will discriminate; even if explicit discrimination is prohibited, algorithms are able to discriminate implicitly; even if equality across groups is enforced, within-group inequality might persist.

Widening the frame

These notions of fairness take a rather narrow frame: Is it justifiable that different people in a particular situation are treated differently? It is advisable to take a step back and instead ask a bigger question: What impact does differentiated treatment have on social inequality more broadly, and how do we evaluate this impact?

What “big data” and “machine learning” do to economic and social inequality depends upon how new inequalities in treatment align with pre-existing inequalities. If the new algorithms are less likely to hire women, more likely to send Black defendants to jail, or more likely to charge poor people higher prices, then they amplify pre-existing inequality. But we could conceive of the opposite happening as well—if unequal treatment were negatively correlated with pre-existing inequalities. It might happen, for instance, that targeted prices will be lower for poorer individuals, because their ability to pay for various goods is smaller. It is hard to say which way this will play out, and the answer will be different across settings. It is all the more important, then, to stay vigilant and to observe how newly generated inequalities in treatment correlate with old inequalities, and how they therefore impact overall social inequality.

Equality of opportunity?

Returning to questions of fairness, how should we think about whether unequal treatment by algorithms (or by society more generally) is unfair? Many people subscribe to some version of the idea of “equality of opportunity,” which suggests that inequality brought about by circumstances that one has no control over is unfair. For instance, children of poor parents should have the same chances as children of rich parents. (As documented by recent research, that is very much not the case. In particular in the US, children of richer parents tend to have much higher incomes themselves, relative to children of poorer parents.)

But would the absence of such strong intergenerational income inequality qualify as “equality of opportunity?” Should we not also take into account other factors already determined at the time of our birth? How are gender and race incorporated into this framework? Or geography? What about our parents’ education? We can continue building on this list, with each additional factor allowing us to predict a larger fraction of lifetime inequality. As we add more factors, equality of opportunity becomes an increasingly demanding concept.

Machines pushing our normative boundaries

Which brings us back to machine learning. If we take seriously the notion that we should not be held responsible for circumstances determined at the time of our birth, then we should consider any predictable inequalities of life outcomes to be violations of fairness. With the accumulation of data and new prediction methods, more and more becomes predictable.

But when everything becomes predictable, does it even make sense to use prediction to draw a line between “opportunity” on the one hand and “effort” or “deservingness” on the other? We are pushed by technology to question altogether this distinction, as well as notions of fairness restricted to equality across isolated dimensions (such as gender or parental income). We are pushed instead to confront the rising inequalities of outcomes in our society, no matter where they are coming from.

A utopia of learning machines

We have discussed some problems that might arise from targeting using machine learning, and from the emergence of new inequalities. But is there a chance to leverage targeting and prediction for socially desirable ends?

Humans are different in innumerable ways—in our tastes and abilities, in our childhood experiences and our biological makeup, and so on. Treating everyone the same in a formal sense does not lead to equal possibilities in life. This is nicely summarized by a cartoon in which a monkey, a donkey and a whale are challenged to climb a tree. To take another obvious real-world example, administering the same medical treatments to everyone would not lead to particularly good health outcomes. Medical treatments need to be targeted to compensate for ailments that have been diagnosed.

Assume now that we desire to achieve equality along some dimension—let’s call this dimension “possibilities in life.” To maximize and equalize everyone’s possibilities in life, a utopian social policy might leverage machine learning based prediction to provide everyone with the public goods they need, whether in education, medicine, housing, monetary transfers, transport infrastructure, or some other area. Such a policy might achieve more in terms of equalizing possibilities in life than a one-size-fits-all policy, forcing everybody to climb up the tree.

Side remark against the abuse of genetic research

This line of reasoning, incidentally, also shows the meaninglessness of the type of genetic research often cited by the biologistic right (think of the “Bell Curve”). The classic version of such research uses twin studies to come up with factoids such as “$x$% of IQ is genetically determined.” The more modern variant of such arguments uses sequenced gene data and machine learning methods to predict individual life outcomes.

Consider, as an example, eyesight. A twin study might plausibly find that genetically identical twins separated at birth have nearly identical eyesight. One might thus conclude that 99% of variation in eyesight is genetic. Now consider a hypothetical government agency distributing free eyeglasses (cheaply procured) of the right strength to everyone who needs them. The end result might be an improvement for everyone and a complete equalization of (effective) eyesight. The point of the example is that the share of some outcome that is predictable from genes always depends on social institutions and political choices. The mechanisms that translate biological makeup or other individual conditions into socially relevant outcomes are always mediated by social institutions. Genetic predictability therefore tells us nothing about the scope for political action.
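The eyeglasses example can be stated numerically. In this toy model (the numbers are invented for illustration), uncorrected eyesight is entirely determined by a "genetic" factor, so 100% of its variance is predictable from genes; a free-eyeglasses policy then drives that variance to zero without changing a single gene.

```python
import statistics

# Toy model: uncorrected eyesight equals a "genetic" factor on a 0-1 scale,
# so all variance in the outcome is genetically predictable.
genes = [0.2, 0.5, 0.9, 0.3, 0.7, 1.0, 0.4, 0.6]
uncorrected = genes[:]

# Policy: free eyeglasses of the right strength bring everyone to full effective eyesight.
corrected = [1.0 for g in genes]

var_before = statistics.pvariance(uncorrected)  # positive: genes "explain" everything
var_after = statistics.pvariance(corrected)     # zero: the policy erased the inequality
```

The heritability estimate was perfectly accurate before the policy and perfectly irrelevant after it, which is precisely the point: predictability from genes is a fact about institutions as much as about biology.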

The same argument holds not just for twin studies, but also for predictability using sequenced gene data. The use of chopsticks, for instance, is highly predictable using sequenced genes. It would of course be ridiculous to conclude that chopstick use is genetically determined and unalterable by social institutions.

Who owns the data?

To recap, a possible utopian use of machine learning and targeting is to use these methods to compensate for pre-existing inequalities and to provide everyone with the goods they need to maximize their possibilities in life. This contrasts sharply with targeting to maximize profits or votes.

A third option, beyond the use and ownership of big data by private corporations, or by hypothetical benevolent policymakers, would be to pose the question of property rights in data. As it stands now, none of us has any control over, ownership of, or even knowledge about the data describing us, data that are accumulated constantly.

There is nothing natural or unavoidable about this situation. One might well conceive of a legal and technological framework where everybody retains ownership of any data that are specific to themselves. In such a framework, data about us might be stored on some secure third party platform, while remaining under our control. We might then decide on a case-by-case basis whether to pass on data about us to anyone requesting them—if we deem so to be beneficial for ourselves. This would require a legal framework where we retain sole rights over our data, but can “rent” them out, similar to the framework for wage labor. Our labor cannot be sold between other parties, but we can decide to work for an employer, while retaining the right to quit at any time. Contrast this with the current regime for data ownership, where we don’t have any possibility whatsoever to “quit.”

While it is not obvious that such a hypothetical framework would solve all problems and inequalities entailed by machine learning and targeting, it would certainly yield very different outcomes. It would be a worthwhile exercise to spell out the pre-conditions and possible consequences of such an alternative regime.

Collection Digital Ethics