↳ Statistics

September 9th, 2019

[Image: Original & Forgery]

The difficulties of causal reasoning and race

While the thorny ethical questions dogging the development and implementation of algorithmic decision systems touch on all manner of social phenomena, arguably the most widely discussed is that of racial discrimination. The watershed moment for the algorithmic ethics conversation was ProPublica's 2016 article on the COMPAS risk-scoring algorithm, and a huge number of ensuing papers in computer science, law, and related disciplines attempt to grapple with the question of algorithmic fairness by thinking through the role of race and discrimination in decision systems.

In a paper from earlier this year, ISSA KOHLER-HAUSMANN of Yale Law School examines the way that race and racial discrimination are conceived of in law and the social sciences. Challenging the premises of an array of research across disciplines, Kohler-Hausmann argues for both a reassessment of the basis of reasoning about discrimination, and a new approach grounded in a social constructivist view of race.

From the paper:

"This Article argues that animating the most common approaches to detecting discrimination in both law and social science is a model of discrimination that is, well, wrong. I term this model the 'counterfactual causal model' of race discrimination. Discrimination, on this account, is detected by measuring the 'treatment effect of race,' where treatment is conceptualized as manipulating the raced status of otherwise identical units (e.g., a person, a neighborhood, a school). Discrimination is present when an adverse outcome occurs in the world in which a unit is 'treated' by being raced—for example, black—and not in the world in which the otherwise identical unit is 'treated' by being, for example, raced white. The counterfactual model has the allure of precision and the security of seemingly obvious divisions or natural facts.

Currently, many courts, experts, and commentators approach detecting discrimination as an exercise measuring the counterfactual causal effect of race-qua-treatment, looking for complex methods to strip away confounding variables to get at a solid state of race and race alone. But what we are arguing about when we argue about whether or not statistical evidence provides proof of discrimination is precisely what we mean by the concept DISCRIMINATION."
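One way to make the model Kohler-Hausmann critiques concrete is the potential-outcomes notation of causal inference; the formalization below is our gloss, not the Article's own:

```latex
% Y_i(r): the outcome unit i would experience if "raced" r.
% The counterfactual "treatment effect of race" for unit i:
\tau_i = Y_i(\mathrm{black}) - Y_i(\mathrm{white})
% On this model, discrimination is detected when the average effect
% among otherwise identical units is nonzero:
\mathbb{E}\!\left[\, Y_i(\mathrm{black}) - Y_i(\mathrm{white}) \,\right] \neq 0
```

Kohler-Hausmann's objection is precisely that the manipulation this notation presupposes, holding a unit fixed while switching only its race, is incoherent if race is socially constituted rather than a detachable attribute.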

Link to the article. And stay tuned for a forthcoming post on the Phenomenal World by JFI fellow Lily Hu that grapples with these themes.

  • For an example of the logic Kohler-Hausmann is writing against, see Edmund S. Phelps' 1972 paper "The Statistical Theory of Racism and Sexism." Link.
  • A recent paper deals with the issue of causal reasoning in an epidemiological study: "If causation must be defined by intervention, and interventions on race and the whole of SES are vague or impractical, how is one to frame discussions of causation as they relate to this and other vital issues?" Link.
  • From Kohler-Hausmann's footnotes, two excellent works informing her approach: first, the canonical book Racecraft by Karen Fields and Barbara Fields; second, a 2000 article by Tukufu Zuberi, "Deracializing Social Statistics: Problems in the Quantification of Race." Link to the first, link to the second.
⤷ Full Article

August 23rd, 2019

Is it impossible to be fair?

Statistical prediction is increasingly pervasive in our lives. Can it be fair?

The Allegheny Family Screening Tool is a computer program that predicts whether a child will later have to be placed into foster care. It's been used in Allegheny County, Pennsylvania, since August 2016. When a child is referred to the county as at risk of abuse or neglect, the program analyzes administrative records and then outputs a score from 1 to 20, where a higher score represents a higher risk that the child will later have to be placed into foster care. Child welfare workers use the score to help them decide whether to investigate a case further.
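As a rough illustration of how such a score enters the workflow (the threshold and decision labels here are hypothetical assumptions, not the county's actual policy):

```python
# Hypothetical sketch: mapping a 1-20 risk score to a screening action.
# The threshold value and labels are illustrative assumptions, not the
# actual Allegheny Family Screening Tool policy.

def screening_decision(risk_score: int, threshold: int = 15) -> str:
    """Map a 1-20 risk score to a recommended next step."""
    if not 1 <= risk_score <= 20:
        raise ValueError("risk score must be between 1 and 20")
    return "investigate further" if risk_score >= threshold else "standard review"

print(screening_decision(17))  # investigate further
print(screening_decision(6))   # standard review
```

The score does not decide the case by itself; as described above, child welfare workers use it as one input among others.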

Travel search engines like Kayak or Google Flights predict whether a flight will go up or down in price. Farecast, which launched in 2004 and was acquired by Microsoft a few years later, was the first to offer such a service. When you look up a flight, these search engines analyze price records and then predict whether the flight's price will go up or down over some time interval, perhaps along with a measure of confidence in the prediction. People use the predictions to help them decide when to buy a ticket.
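A minimal sketch of what such a directional prediction might look like (this simple trend heuristic is our illustration, not Farecast's actual model):

```python
# Illustrative sketch: predict whether a fare will rise or fall by
# taking a majority vote over day-to-day moves in its price history.
# Real systems use far richer features; this is only a toy heuristic.

def predict_direction(prices: list[float]) -> tuple[str, float]:
    """Return ('up' or 'down', confidence in [0, 1])."""
    moves = [later - earlier for earlier, later in zip(prices, prices[1:])]
    ups = sum(m > 0 for m in moves)
    downs = sum(m < 0 for m in moves)
    total = max(ups + downs, 1)  # avoid division by zero on flat histories
    direction = "up" if ups >= downs else "down"
    return direction, max(ups, downs) / total

# Four observed moves: +10, -5, +15, +10 -> majority "up", confidence 0.75
print(predict_direction([300, 310, 305, 320, 330]))
```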

⤷ Full Article

June 17th, 2019

[Image: Insulators (Magritte machine)]

The political history of economic statistics

Debates over the relevance of indicators like GDP for assessing the health of domestic economies are persistent and growing. Critics point to the failure of such measures to holistically capture societal wellbeing, and argue in favor of alternative metrics and the disaggregation of GDP data. These debates reflect the politics behind the economic knowledge that shapes popular understanding and policy debates alike.

In his 2001 book Statistics and the German State, historian Adam Tooze examined the history of statistical knowledge production in Germany, covering the period from the turn of the century to the end of the Nazi regime, "driven by the desire to understand how this peculiar structure of economic knowledge came into existence… and the relationship between efforts to govern the economy and efforts to make the economy intelligible through systematic quantification."

From the book's conclusion:

"We need to broaden our analysis of the forces bearing on the development of modern economic knowledge. This book has sought to portray the construction of a modern system of economic statistics as a complex and contested process of social engineering. This certainly involved the mobilization of economists and policy-makers, but it also required the creation of a substantial technical infrastructure. The processing of data depended on the concerted mobilization of thousands of staff. In this sense the history of modern economic knowledge should be seen as an integral part of the history of the modern state apparatus and more generally of modern bureaucratic organizations… The development of new forms of economic knowledge can therefore be understood as part of the emergence of modern economic government and as a sensitive indicator of the relationship between state and civil society."

Link to the book preview, link to the book page on Tooze's website.

  • For a more generalized account of the political history of statistical knowledge (inclusive of economic statistics), see The Politics of Large Numbers by Alain Desrosières. Link. Another excellent item in the history of statistical knowledge: A History of the Modern Fact, on the advent and impact of double-entry bookkeeping. Link.
  • In the Winter 2019 issue of the Journal of Economic Perspectives, Hugh Rockoff examines the political history of American economic statistics, and tracks the emergence and institutionalization of measures of "prices, national income and product, and unemployment." Link.
  • Previously shared here, research by Aaron Benanav examines the institutional history linking the concept of "informality" and unemployment metrics developed by the International Labor Organization. Link to his paper.
  • A recent paper by Andrea Mennicken and Wendy Nelson Espeland surveys the quantification literature. Link. And a (previously shared) panel discussion on the historiography of quantification. Link.
⤷ Full Article