For the second year in a row, WrangleConf did not disappoint. The conversation picked up right where last year's left off: on the ethics of our craft. Last year the focus was on the humans building algorithms and the humans whom algorithms affect. This year, the discussion expanded in scope to consider the growing number of people who interact with data science teams.
With an eye toward the increasing presence of data science in our daily lives, the speakers were more focused than ever on strategies to build and maintain trust: opening communication, recognizing bias, and, well, giving a damn.
To make algorithms effective, we need effective communication
We all have expectations. They're usually based on some form of data, even if they're not based on explicit data analysis. The problem with our expectations is that they often carry bias. When we consume data science, that bias reduces our trust in recommendations that don't jibe with our priors. We're quick to dismiss the result as wrong.
To bring this issue to life, Moritz Sudhof of Kanjoya highlighted a number of biases inherent in employee performance management. For instance, managers typically remember only the most recent events or they seek to confirm things they “already know” about an employee.
Imagine a manager conducting a review for one of her employees, James. If other employees rate James in a way that doesn't square with the manager's view of his performance, it's going to be harder for her to trust the reviews. It's easy for the manager to brush them off by saying that the algorithm that produced the highlighted reviews is wrong. The in-product experience of recommendations can make or break their usefulness. As data scientists, we need to partner with product people to present algorithms effectively:
"The best model is only as good as the user experience allows it to be" - Moritz Sudhof of @Kanjoya #WrangleConf pic.twitter.com/A9Mf8r5KGB
— Clare Corthell (@clarecorthell) July 28, 2016
It's not enough to just make recommendations. Kirstin Aschbacher of Jawbone illustrated how the language, timing, and focus of recommendations all matter greatly.
Logging food helps with weight loss. Jawbone A/B test shows notifications increase logging food #WrangleConf pic.twitter.com/EpJOXEVqUz
— Alyssa Fu (@datasciencefu) July 28, 2016
Jawbone has many different ways to tell Jawbone UP users that they should eat better. It turns out that recommending healthy foods that people already like has a more positive effect on outcomes than discouraging people from eating unhealthy foods. The goal of both approaches is the same: in most cases, get people to eat fewer, better things. The difference—and much of the effectiveness—is in the positioning.
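Positioning effects like this typically surface through A/B tests. As a rough illustration (not Jawbone's actual analysis, and with entirely hypothetical counts), here's a minimal sketch of comparing food-logging rates between two notification framings:

```python
# Minimal sketch of an A/B comparison between two notification framings.
# The counts are hypothetical; this is not Jawbone's pipeline or data.
from math import erf, sqrt

def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference in logging rates between variants."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal approximation
    return p_a, p_b, z, p_value

# Variant A: "here are healthy foods you already like"
# Variant B: "stop eating these unhealthy foods"
rate_a, rate_b, z, p = two_proportion_ztest(620, 5000, 540, 5000)
print(f"positive framing: {rate_a:.1%}, negative framing: {rate_b:.1%}, "
      f"z = {z:.2f}, p = {p:.4f}")
```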
Both of these examples involve partnership with people outside the realm of data science. Model fit matters a lot less than real-world results. Throughout the day we heard from data scientists who ship increasingly useful algorithms when they collaborate with people who are not traditionally involved in algorithm development.
Keep an eye on digital vulnerability
The scale of modern data products' use and impact can introduce severe consequences, especially for marginalized populations. Chris Diehl of The Data Guild calls this Digital Vulnerability: anyone may be a victim of the unwitting disclosure of personal information or of a biased prediction algorithm.
Open disclosure online is a privilege but isn't available to everyone. @ChrisDiehl #WrangleConf pic.twitter.com/OmiZposjsJ
— Alyssa Fu (@datasciencefu) July 28, 2016
We see exploitations of this vulnerability in the news all the time:
- Pokemon Go is difficult, if not impossible, to play in black neighborhoods.
- Waze's routing algorithm has turned quiet neighborhoods into bustling thoroughfares.
- LinkedIn emailed a New York school teacher's contacts, falsely identifying him as a white supremacist.
- Facebook tweaked its feed algorithm to intentionally give some users more positive feeds than others.
Abe Gong of Aspire Health later drove this point home when he said that “Algorithms are de facto gatekeepers to opportunity.” COMPAS, a proprietary algorithm used to predict recidivism and inform parole decisions, may be deeply unfair to blacks. This algorithm literally determines whether someone can be released from prison.
Compare this to an algorithm that determines retargeting for ads. We can opt out of disclosing our personal information to this algorithm. We have a choice. Those evaluated by COMPAS do not. As our personal data is increasingly used to determine whether we'll be good homeowners, healthy enough for low-cost insurance, or successful employees, we need to monitor the bias of algorithms all the more carefully.
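Monitoring that bias can start with something simple. As a minimal sketch (not the COMPAS methodology, and with entirely made-up records), one widely discussed check is to compare false positive rates across the groups a model affects:

```python
# Minimal sketch of a group-wise bias check: compare false positive rates
# across groups. Records and group labels are hypothetical, for illustration only.
from collections import defaultdict

def false_positive_rates(records):
    """records: iterable of (group, predicted_high_risk, actually_reoffended)."""
    fp = defaultdict(int)  # predicted high risk, did not reoffend
    tn = defaultdict(int)  # predicted low risk, did not reoffend
    for group, predicted, actual in records:
        if not actual:  # only people who did not reoffend count toward FPR
            if predicted:
                fp[group] += 1
            else:
                tn[group] += 1
    groups = set(fp) | set(tn)
    return {g: fp[g] / (fp[g] + tn[g]) for g in groups if fp[g] + tn[g] > 0}

# Hypothetical scored population: (group, predicted_high_risk, actually_reoffended)
sample = [
    ("group_a", True, False), ("group_a", False, False), ("group_a", True, True),
    ("group_b", True, False), ("group_b", True, False), ("group_b", False, False),
]
print(false_positive_rates(sample))  # a large gap between groups warrants review
```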
Handling the impact
Pete Skomoroch said it best: “If we don't figure out how to handle these things better, they will be handled for us in ways we don't like.”
Diehl suggests that we need “trusted implementations” that are open source and vetted by the widest community possible. Gong was a little more specific—he presented the idea of a data ethics review and challenged the audience to participate.
. @AbeGong’s call to action - perform an ethics review and tell world about it prior to deploying algos #WrangleConf pic.twitter.com/WOsNMyqFpr
— Chris Diehl (@ChrisDiehl) July 28, 2016
Both of them created public documents to keep the discussion going.
Josh Wills of Slack pointed to Chuck Klosterman's I Wear the Black Hat for the trait to watch out for: “The villain is the person who knows the most but cares the least.” In data science, that's the person or company that has the most data and doesn't care how its use impacts people. And there are people who don't care; the most indifferent person in the industry will always set the standard. It's up to us as data scientists not to be that person.
Looking at the #WrangleConf Twitter stream, it's clear that this year's ethics discussions fueled a lot of excitement. Don't let that fire go out. As data scientists, let's challenge ourselves to second-guess assumptions, identify and eliminate biases, and, above all, keep the conversation going.