As dean of Georgetown’s recently launched McCourt School of Public Policy, I believe there is no better time than now to be a public policy scholar, educator or student.
The last several years have brought with them a deluge of data from sensors, satellites, cell phones, and digitized text that was unimaginable even 20 years ago. With it, the data brings amazing opportunities for public policy research on some of the world’s most pressing challenges.
The world now stores enough new digital data each year to fill the Library of Congress – which holds more than 150 million books and other items in its collections – roughly 60,000 times over.
And the McKinsey Global Institute estimates that digital data will grow by 40 percent a year through 2020.
This phenomenal amount of new data, combined with the advances in computer technology, has created the conditions to formulate smarter, more personalized public policy, much in the same way the study of DNA spawned personalized medicine.
Where is all this new data coming from?
From 3050 B.C., when the first census was taken in ancient Egypt, up until the mid-1990s, when widespread commercial and private use of the Internet began, new data came mainly from specially designed government or business surveys.
Now the biggest source of new information, often called “data exhaust,” is produced by millions of people who, going about their daily lives, post information about their preferences, attitudes and interests online.
New data is also arising as a byproduct of the continual collection of administrative data from manufacturing, service provision and financial transactions in both the private and public sectors. Other original data is coming from digitized texts from libraries and newspapers. And new satellites with expanded capabilities are now producing huge amounts of real-time geospatial data on everything from climate change and outbreaks of severe weather to the movement of immigrants and changes in commuting patterns.
Scratching the Surface
While each of these massive data sets has value, the real advance in understanding will arise when these sets are merged to produce new combinations of attributes. This process allows us to ask and answer questions not contemplated when the data were first created.
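The kind of merging described above can be sketched with a small, entirely hypothetical example: two made-up data sets – flu-related search volume and average temperature, each keyed by county – are joined on their shared key, letting an analyst pose a question neither set answers alone. The column names and figures here are illustrative only, not drawn from any real source.

```python
import pandas as pd

# Hypothetical data set 1: weekly flu-related search volume by county
searches = pd.DataFrame({
    "county": ["Adams", "Baker", "Clark"],
    "flu_searches": [120, 340, 95],
})

# Hypothetical data set 2: average weekly temperature by county
weather = pd.DataFrame({
    "county": ["Adams", "Baker", "Clark"],
    "avg_temp_f": [28.5, 33.1, 41.0],
})

# Merging on the shared key produces a combined view that neither
# data set supports on its own.
combined = searches.merge(weather, on="county")

# A question not contemplated when either set was created:
# is flu-related searching higher where it is colder?
correlation = combined["flu_searches"].corr(combined["avg_temp_f"])
print(combined)
print(f"correlation: {correlation:.2f}")
```

The value of the merge lies in the new combination of attributes: each original data set was collected for its own purpose, yet together they support a question that neither collector anticipated.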
The public sector is just now scratching the surface in its exploration of the possibilities presented by these new massive data sets. The Centers for Disease Control and Prevention, for example, use data mining of online health-related searches to predict infectious disease outbreaks. The International Food Policy Research Institute measures excess food price spikes in local markets to expedite country-level food security responses. NOAA collects 1.5 billion observations from sensors around the globe each day to produce millions of weather reports. And FEMA monitors Twitter to assess tornado damage.
These are only a few examples of the potential for massive data to save lives, improve the environment, and protect society. The scope for advances in the use of predictive analytics to improve operational effectiveness, resource allocation, and early problem detection is nearly limitless.
Driving the Car
Picture for a moment matching employee and employer data on job and firm characteristics, then cross-linking it with detailed information about workers’ schooling and job characteristics of the local area and even health outcomes. Combining data sets like these could lead to a richer understanding of employee and employer dynamics, resulting in more relevant schooling and training, higher productivity, and better-designed government policies to spur growth and reduce unemployment.
A better understanding of what makes workers and companies prosper would be extremely valuable information.
The private sector already faces a shortage of people skilled in data analytics, but the problem in the public sector is orders of magnitude greater. Without people who know how to use and interpret this data, the Open Government movement will produce a beautiful car that no one can drive.
If these two sectors can work together to interpret data correctly and ethically, building a bridge between data scientists on one side and government policy practitioners and nonprofits on the other, incredible advances in public policy become possible.
The McCourt School of Public Policy at Georgetown University, with its strength in quantitative analysis, is in a strong position to build these bridges. The school is home to a focused multidisciplinary faculty that regularly conducts evidence-based research on new and enduring public policy challenges.
Many of our faculty members also move back and forth from high-level government positions, thus informing the classroom and their research with hands-on experience.
A Bright Future
Our Massive Data Institute will support research and teaching that focuses on linking and integrating next-generation data in ways that deepen our understanding of society and human behavior. The goal of this work is not only to improve public policy, but also, one day, to tailor policy to specific individuals.
This new frontier – blossoming only 40 years after the creation of the microchip – is an incredible opportunity for public policy scholarship and research.
The McCourt School Massive Data Institute will help bridge the gap between data scientists and policy makers, formulate the right policy questions, and work to find solutions to important societal problems. I feel privileged to be a part of this exciting time for the future of public policy.
Edward Montgomery is the Dean of the McCourt School of Public Policy at Georgetown University and will be participating as panelist during today’s Data Innovation Day. Ideas Lab is a media partner for the event and Ideas Lab Editor Brock N. Meeks will be moderating a panel on the Data Economy.