BY HANNAH MASUGA
Data-driven policymaking is widely touted as the best way to improve government, but it also poses a threat to our fundamental freedoms. It’s true that research intended to drive more efficient and effective programming provides important insights into how society functions. The danger comes from leveraging technology to implement our findings. This automation of services can lock in our current understanding of human behavior and limit individual liberty. As we bring digital and algorithmic solutions to policy, the role of human judgment will become more, not less, important in assessing the fairness and justice of these tools.
Algorithmic tools use data and advanced computational techniques to augment or automate human judgment. These techniques have created massive efficiencies across sectors, driving widespread adoption. The McKinsey Global Institute projects that further automation could increase productivity growth globally by 0.8 to 1.4 percent annually. However, scholars have increasingly raised concerns over algorithms’ negative effects. WIRED recently declared 2017 “the year we fell out of love with algorithms,” citing the destructive effects of Facebook’s algorithmically driven News Feed on the American election.
This conflict appears most acute when deploying algorithms to aid in the traditional functions of government. For example, Boston Public Schools recently held a competition to develop an algorithm aimed at optimizing a seemingly straightforward logistical problem: transportation schedules and school start times. The resulting recommendation would save the school system an estimated $5 million but generated an outcry among parents whose children would start school as early as 7:15 a.m. Policy makers and administrators arguably lost sight of constituents’ concerns in pursuit of algorithmic efficiency.
While many assume math is an objective abstraction, the application of algorithmic systems to policy problems is inherently political. A growing number of investigative reports have demonstrated that algorithms are not free from existing biases. Without proper framing, and without applying their subject matter expertise to these tools, policy makers risk reinforcing and obscuring discrimination under the assumption that math and computation are inherently objective exercises.
These biases are often embedded in the services, functions, and technologies themselves. StreetBump, a mobile application that automatically reports potholes to cities based on motion-sensor data, leverages algorithms to guide its alerts and interventions. Though the app is seemingly neutral in creating data and producing recommendations, the resources it requires (users need both a car and a cell phone with a data plan) mean that its users will tend to be relatively well resourced, potentially reinforcing disparate spending on city services in wealthier neighborhoods.
Government officials have demonstrated significant interest in moving to data-driven decision-making but have not yet expressed a commensurate understanding of the opportunities and risks that the widespread adoption of algorithms poses. Rather than attempt to catch up to the private sector or avoid algorithms altogether, policy makers have a unique opportunity to lead the conversation on fairness and transparency. In late 2017, the New York City Council established the first oversight body to review the ways in which algorithms are being used by city agencies. This is an excellent first step toward bringing accountability to the automated deployment of city services. Additionally, by leveraging a standardized framework to analyze the context and application of particular algorithms, policy makers can make more educated decisions about the potential benefits and risks these technologies create.
Policy makers should create simple heuristics, including identifying the origin and content of the data used to develop the algorithm. Critical considerations include knowing how the data were collected, what data elements are included, and whether the algorithm has been “enriched” through mergers with additional outside data.
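These heuristics can be made concrete as a simple provenance checklist. The sketch below, with hypothetical field names and example values, mirrors the questions in the text and flags the ones still unanswered for a given system:

```python
# A minimal provenance checklist for an algorithm under review.
# The fields mirror the questions in the text; the example values
# are hypothetical.
provenance = {
    "data_origin": "city 311 service requests, 2015-2017",
    "collection_method": "resident-submitted reports (phone and app)",
    "data_elements": ["request_type", "location", "timestamp"],
    "external_enrichment": ["census tract income data"],  # merged-in outside data
}

def flag_open_questions(record):
    """Return the provenance questions a policy maker still needs answered."""
    questions = []
    if not record.get("data_origin"):
        questions.append("Where did the data originate?")
    if not record.get("collection_method"):
        questions.append("How were the data collected?")
    if not record.get("data_elements"):
        questions.append("Which data elements are included?")
    if record.get("external_enrichment"):
        questions.append("How might the merged outside data introduce bias?")
    return questions

print(flag_open_questions(provenance))
# → ["How might the merged outside data introduce bias?"]
```

Even a checklist this simple forces the enrichment question into the open: any merged-in outside data deserves the same scrutiny as the original collection.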
Policy makers must ask the critical question of what process generated the data. In Weapons of Math Destruction, Cathy O’Neil recounts how a UK medical school developed a test for screening candidates based on the performance of previous students. It was later discovered that, during the admissions period used to develop the algorithm, staff had discriminated against female candidates. The test’s outcomes subsequently reflected the bias inherent in the data set.
In cases where a model is built from many variables and the relationships among the data are unclear, policy makers and their staffers should dig deeper before deploying the model to make real-life decisions. For example, an investigation into Chicago’s Strategic Subject List, an algorithmic model used to predict potential criminal offenders, found that having been a victim of assault or gun violence contributed more to the resulting risk score than being arrested for violent crime or gang affiliation.
This kind of insight not only serves to better allocate services but also has the potential to highlight an important reality in the broader policy conversation on crime. The perceived complexity of algorithms often obscures important details. Policy makers can and should step into this gap to inform and help guide constituents and experts.
Policy makers must additionally develop processes for evaluating the quality and fairness of algorithmic decisions. Most algorithms do not have mechanisms for interpreting or auditing their output. As a result, there is little recourse for individuals who are subject to adverse decisions. Consumer credit scoring in the United States provides some guidance. Not only can individuals receive an explanation of the factors leading to their score, but the explanatory output gives the individual a course of action to improve their score. Without the opportunity to understand or challenge automated decisions, we are creating an automated tyranny.
Algorithm developers have begun to establish technical recommendations aimed at creating baselines for accountability. The organization Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) recommends a number of practical steps to proactively address output concerns, including communicating uncertainty, sharing explanations in output, and allowing subjects to challenge determinations. Short of those embedded features, nontechnical policy makers should seek to evaluate an algorithm’s output more broadly at both the aggregate and individual levels.
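To make the FAT/ML recommendations tangible, the sketch below shows what a single determination might look like when it carries its own uncertainty range, explanation, and challenge path. The record structure, field names, and values are all hypothetical, not drawn from any real system:

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """One algorithmic determination, packaged with the accountability
    features FAT/ML recommends: communicated uncertainty, an explanation
    shared in the output, and a way for the subject to challenge it."""
    subject_id: str
    outcome: str           # e.g. "benefit_denied" (hypothetical label)
    score: float           # the model's raw output
    uncertainty: tuple     # (low, high) plausible range around the score
    reasons: list          # top contributing factors, ordered by weight
    challenges: list = field(default_factory=list)

    def explain(self) -> str:
        # The explanation travels with the decision, not on request only.
        factors = "; ".join(f"{name} ({weight:+.2f})"
                            for name, weight in self.reasons)
        low, high = self.uncertainty
        return (f"Outcome: {self.outcome} (score {self.score:.2f}, "
                f"plausible range {low:.2f}-{high:.2f}). "
                f"Main factors: {factors}")

    def challenge(self, statement: str) -> None:
        # Let the subject contest the determination on the record.
        self.challenges.append(statement)

d = Decision(
    subject_id="case-001",
    outcome="benefit_denied",
    score=0.62,
    uncertainty=(0.48, 0.71),
    reasons=[("missed_filing_deadline", 0.30), ("income_over_threshold", 0.21)],
)
print(d.explain())
d.challenge("The filing was postmarked on time; see attached receipt.")
```

The design choice worth noting is that uncertainty, reasons, and the challenge log live on the decision record itself, so no separate request process is needed to see why a determination was made.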
Lawmakers deploying algorithmic systems should request full exports or samples of the processed data where possible. At the very least, these samples can both aid in evaluating the impact of the algorithms and create additional insights into long-standing policy considerations. This was the hope for the Boston-Uber data-sharing agreement, a partnership aimed at aiding public transit planning by assessing Uber ride patterns. A growing number of cities are recognizing that ride-sharing networks can offer important insights to transit policymaking. In the absence of data from the major ride-sharing companies, many cities have begun utilizing ancillary information, such as vehicle location data, to measure ride-sharing’s impact.
At the individual level, policy makers’ considerations and decision rules should focus on justification and the information available to the subject. Algorithms are often used because they are less expensive for governments; more troubling, they are often brought first to issues that affect marginalized or less powerful groups. Emerging best practices suggest that clearly demonstrating to subjects why certain decisions were made is an important component of applying algorithms fairly and empowering those who are adversely affected to file objections.
To check algorithms’ treatment of individual cases, Cathy O’Neil recommends randomly selecting records for manual review and performing a qualitative assessment of an algorithm’s decision-making, an exercise that ensures human judgment still supervises computer algorithms. Manual review of individual records is one of the most powerful tactics in a rapidly emerging field, as human intelligence in this area often outshines that of computers. There may come a time when artificially intelligent systems can replicate all human faculties, from empathy to engineering, but in the immediate future, algorithms must be augmented by human intelligence and ethical judgment.
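The random-review tactic O’Neil recommends is straightforward to operationalize. A minimal sketch, with hypothetical decision records standing in for a real system’s output:

```python
import random

def sample_for_review(records, k, seed=None):
    """Draw a simple random sample of processed records so a human
    reviewer can qualitatively assess the algorithm's decisions."""
    rng = random.Random(seed)  # a fixed seed makes the audit reproducible
    return rng.sample(records, min(k, len(records)))

# Hypothetical decision records produced by an automated system.
decisions = [{"id": i, "outcome": "flagged" if i % 7 == 0 else "cleared"}
             for i in range(500)]

audit_batch = sample_for_review(decisions, k=25, seed=2018)
print(len(audit_batch))  # → 25 cases queued for manual review
```

Fixing the seed matters for accountability: two auditors running the same review draw the same cases, so the sample itself cannot be quietly cherry-picked.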
In summary, public concerns should not stop policy makers from seeking to benefit from the efficiencies of algorithms. But they should take special care to incorporate their values, check algorithms’ outputs, and provide easy-to-understand rationales for their use. Without these elements, policy makers run the risk of further obscuring their decision-making behind tools that are opaque to their constituents. The legitimacy of government institutions and services is at stake.
Hannah Masuga is a first-year master in public policy candidate at the John F. Kennedy School of Government at Harvard University. Previously she worked in the software industry where she led implementation and analytics teams serving clients in health care and government. Hannah is interested in digital innovation in the public sector, the platform economy, and algorithmic transparency.
 “What’s now and next in analytics, AI, and automation,” McKinsey & Company, May 2017, accessed 9 February 2018, https://www.mckinsey.com/global-themes/digital-disruption/whats-now-and-next-in-analytics-ai-and-automation.
 Tom Simonite, “2017 Was The Year We Fell Out of Love with Algorithms,” WIRED, 26 December 2017, https://www.wired.com/story/2017-was-the-year-we-fell-out-of-love-with-algorithms/.
 Kade Crockford and Joi Ito, “Don’t blame the algorithm for doing what Boston school officials asked,” The Boston Globe, 22 December 2017, https://www.bostonglobe.com/opinion/2017/12/22/don-blame-algorithm-for-doing-what-boston-school-officials-asked/lAsWv1Rfwqmq6Jfm5ypLmJ/story.html.
 “Digital Decisions,” Center for Democracy & Technology, accessed 22 November 2017, https://cdt.org/issue/privacy-data/digital-decisions/.
 Lauren Kirchner, “New York City Moves to Create Accountability For Algorithms,” ProPublica, 18 December 2017, https://www.propublica.org/article/new-york-city-moves-to-create-accountability-for-algorithms.
 Cathy O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, First edition (New York: Crown, 2016).
 Jeff Asher and Rob Arthur, “Inside the Algorithm That Tries to Predict Gun Violence in Chicago,” The New York Times, 13 June 2017, https://www.nytimes.com/2017/06/13/upshot/what-an-algorithm-reveals-about-life-on-chicagos-high-risk-list.html.
 Matthew Sheret, “Making it clear when machines make decisions,” Writing by IF (blog), Medium, 20 April 2017, https://medium.com/writing-by-if/making-it-clear-when-machines-make-decisions-a991115b7697.
 Nicole Dungca, “In first, Uber to share ridership data with Boston,” The Boston Globe, 13 January 2015, https://www.bostonglobe.com/business/2015/01/13/uber-share-ridership-data-with-boston/4Klo40KZREtQ7jkoaZjoNN/story.html.
 Laura Bliss, “To Measure the ‘Uber Effect,’ Cities Get Creative,” CityLab, 12 January 2018, https://www.citylab.com/transportation/2018/01/to-measure-the-uber-effect-cities-get-creative/550295/.
 Sheret, “Making it clear when machines make decisions.”
 Gideon Mann and Cathy O’Neil, “Hiring Algorithms Are Not Neutral,” Harvard Business Review, 9 December 2016, https://hbr.org/2016/12/hiring-algorithms-are-not-neutral.