In Minority Report, Steven Spielberg’s futuristic movie set in 2054 Washington, D.C., three “pre-cogs” are hooked up with wires and stored in a strange-looking kiddie pool to predict the occurrence of criminal acts. The “Pre-Crime” unit of the local police, led by John Anderton (played by Tom Cruise), uses their predictions to arrest people before they commit the crimes, even if the person had no clue at the time that he or she was going to commit the crime. Things go a bit awry for Anderton when the pre-cogs predict he will commit murder. Of course, this prediction has been manipulated by Anderton’s mentor and boss to cover up his own past commission of murder, but the plot takes lots of unexpected twists to get us to that revelation. It’s quite a thriller, and the sci-fi element of the movie is really quite good, but there are deeper themes of free will and Big Government at play. If I don’t have any intent now to commit a crime next week, but the pre-cogs say the future will play out so that I do, does it make sense to arrest me now? Why not just tell me to change my path, or would that really change my path? Maybe taking me off the street for a week to prevent the crime is not such a bad idea, but convicting me of the crime seems a little tough, particularly given that I won’t commit it after all. Anyway, you get the picture.
As we don’t have pre-cogs to do our prediction for us, preventive government–a government that intervenes before a policy problem arises rather than in reaction to the emergence of a problem–has to rely on other prediction methods. One prediction method that is all the rage these days in a wide variety of applications involves using computers to unleash algorithms on huge, high-dimensional datasets (a/k/a Big Data) to pick up social, financial, and other trends.
In Predictive Regulation, Sullivan & Cromwell lawyer and recent Yale Law School grad Joshua Mitts lays out a fascinating case for using this prediction method in regulatory policy contexts, specifically the financial regulation domain. I cannot do the paper justice in this blog post, but his basic thesis is that a regulatory agency can use real-time, computer-assisted text analysis of large cultural publication datasets to do three things: spot social and other trends relevant to the agency’s mission, assess whether its current regulatory regime adequately accounts for the effects of a trend were it to play out as predicted, and adjust the regulations to prevent the predicted ill effects (or, one would think, to reinforce or take advantage of the good ones).
To demonstrate how an agency would do this and why it might be a good idea at least to do the text analysis, Mitts searched the Google Ngram text corpus for 2005-06 for two-word phrases (bi-grams) relevant to the financial meltdown–phrases like “subprime lending,” “default swap,” “automated underwriting,” and “flipping property”–words that make us cringe today. The corpus is a word frequency database built from the texts of a lot of books (it would take a person 80 years to read just the words from books published in 2000). He found that these phrases were spiking dramatically in the Ngram database for 2005-06 and reaching very high volumes, suggesting the presence of a social trend. At the same time, however, the Fed was stating that a housing bubble was unlikely because speculative flipping is difficult in homeowner-dominated selling markets and blah blah blah. We know how that all turned out. Mitts’ point is that had the Fed been conducting the kind of text analysis he conducted ex post, it might have seen the world a different way.
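For readers who want a feel for what this kind of bi-gram trend spotting involves, here is a minimal Python sketch. It is my own toy illustration, not Mitts’ code or the Google Ngram tooling: the function names, the three-sentence “corpus,” and the rule that a doubling of relative frequency counts as a spike are all assumptions made up for the example.

```python
from collections import Counter


def bigrams(text):
    """Yield consecutive word pairs from a lowercased, whitespace-split text."""
    words = text.lower().split()
    return zip(words, words[1:])


def relative_frequency(docs_by_year, phrase):
    """Share of all bi-grams in each year's documents that match `phrase`."""
    target = tuple(phrase.lower().split())
    freq = {}
    for year, docs in docs_by_year.items():
        counts = Counter()
        for doc in docs:
            counts.update(bigrams(doc))
        total = sum(counts.values())
        freq[year] = counts[target] / total if total else 0.0
    return freq


def spiking_years(freq, ratio=2.0):
    """Flag years whose frequency is at least `ratio` times the prior year's.

    The threshold is an arbitrary assumption for illustration; a phrase
    appearing for the first time is also treated as a spike.
    """
    years = sorted(freq)
    flagged = []
    for prev, cur in zip(years, years[1:]):
        if freq[prev] == 0:
            if freq[cur] > 0:
                flagged.append(cur)
        elif freq[cur] / freq[prev] >= ratio:
            flagged.append(cur)
    return flagged


# Hypothetical toy corpus, invented for this example.
toy_docs = {
    2003: ["the bank offered a fixed rate loan to the buyer"],
    2004: ["subprime lending grew while the bank offered a fixed rate loan"],
    2005: ["subprime lending and more subprime lending drove subprime lending growth"],
}

toy_freq = relative_frequency(toy_docs, "subprime lending")
toy_spikes = spiking_years(toy_freq)
```

A real analysis would work from the published Ngram frequency tables rather than raw text, and would need a more defensible definition of a spike than a fixed year-over-year ratio.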
Mitts is very careful not to overreach or overclaim in his work. It’s a well-designed and well-executed case study with all caveats and qualifications clearly spelled out. But it is a stunningly good example of how text analysis could be useful to government policy development. Indeed, Mitts reports that he is developing what he calls a “forward-facing, dynamic” Real-Time Regulation system that scours readily available digital cultural publication sources (newspapers, blogs, social media, etc.) and posts trending summaries on a website. The system will also scour the publications of the FDIC, Fed, and SEC and post similar trending summaries. Divergence between the two is, of course, what he’s suggesting agencies look for and evaluate in terms of the need to intervene preventively.
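To make the divergence idea concrete, here is a rough Python sketch of one way to compare how fast a phrase is growing in public sources against how fast it is growing in agency publications. This is not a description of Mitts’ Real-Time Regulation system, whose internals I don’t know; the add-one smoothed growth measure, the function names, and the counts are all hypothetical choices of mine.

```python
def growth(series):
    """Add-one smoothed ratio of the last observation to the first."""
    return (series[-1] + 1) / (series[0] + 1)


def divergence(public, agency):
    """Rank phrases by how much faster they grow in public sources.

    `public` and `agency` map each phrase to a list of counts per time
    window. A phrase missing from `agency` is treated as never mentioned.
    """
    gaps = {}
    for phrase, series in public.items():
        agency_series = agency.get(phrase, [0] * len(series))
        gaps[phrase] = growth(series) - growth(agency_series)
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counts per window, invented for this example.
public_counts = {"default swap": [4, 9, 19], "interest rate": [50, 52, 50]}
agency_counts = {"default swap": [1, 1, 1], "interest rate": [30, 31, 30]}

ranked = divergence(public_counts, agency_counts)
```

In this toy data, “default swap” tops the ranking because public mentions quadruple (after smoothing) while agency mentions stay flat. That gap is the kind of signal an agency would then have to evaluate on the merits.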
For anyone interested in the future of legal computation as a policy tool, I highly recommend this paper–it walks the reader clearly through the methodology, findings, and conclusions, and sparks what is, to my mind, a truly intriguing set of policy questions. Mitts’ proposal raises numerous normative and practical questions that the paper does not address, such as whether agencies could act fast enough under slow-going APA rulemaking processes, whether agencies conducting their own trend spotting must make their findings public, who decides which trends are “good” and “bad,” what the appropriate trending metrics are, and how to keep government response proportional to trend behavior, to name a few. While these don’t quite reach the level of profundity evident in Minority Report, this is just the beginning of the era of legal computation. Who knows, maybe one day we will have pre-cogs, in the form of servers wired together and stored in pools of cooling oil.