Executive Order on Open Data Raises the Bar for Data Access
May 21, 2013 by Jennifer Cobb
On May 9th, President Obama signed an executive order that makes “open and machine readable” data the default standard for all federal government information. As the President commented on signing the order, this initiative is going “to help more entrepreneurs come up with products and services that we haven’t even imagined yet. This kind of innovation and ingenuity has the potential to transform the way we do almost everything.”
The upside for innovation is clear. With vast amounts of government data available in machine-readable formats, new mashups are likely to emerge at a rapid clip. And these innovations are likely to create jobs and social value. As Alex Howard wrote in Slate, “This is perhaps the biggest step forward to date in making government data—that information your tax dollars pay for—accessible for citizens, entrepreneurs, politicians, and others.”
Raising the Bar
It is indeed a huge step forward on a path that began in 2009 when the US government mandated that every US agency provide at least three “high-value data sets” through its portal, Data.gov. As helpful as this directive was, it was clearly just the tip of the iceberg. A significant portion of the data that was made available was locked in formats that could not be readily integrated into IT systems, solutions and apps. This order means to change that.
To raise the bar even further, the administration is not letting agencies off the hook for legacy data. A part of the order is a mandate to all agencies to create a public inventory of their available data and prioritize its release. As the Sunlight Foundation's John Wonderlich wrote, "Agency-wide comprehensive audits of datasets is a big and aggressive move."
An Information-Centric Approach
As the President’s Digital Government Initiative states, the mandate is to move from “managing ‘documents’ to managing discrete pieces of open data and content which can be tagged, shared, secured, mashed up and presented in the way that is most useful for the consumer of that information.” The move away from “documents” to “information” as the primary currency of disclosure poses significant challenges.
Most agencies work with a patchwork of legacy systems that were not architected to make data securely available via APIs or portable formats outside the boundaries of their own silos. Budget to extract this data and adapt these systems is scarce, particularly under the constraints of the sequester. This is one likely reason that the administration was careful to apply the order specifically to all new IT systems. However, even new IT systems often pull data from processes that were not designed to support open, machine-readable outputs. Many valuable datasets are still collected on paper and stored in boxes, waiting to be keyed in. Other datasets are housed in PDFs or scanned documents that make information difficult to extract. This “first mile” problem is often one of the hardest to solve, and it receives little attention from developers, who are more interested in working with data once it is in a digital format.
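To make "machine-readable" concrete, here is a minimal sketch in Python of what the far end of that first mile might look like: records that have already been transcribed from paper forms are published as JSON and CSV. The record fields (permit_id, issued, fee_usd) and the function names are hypothetical, chosen only for illustration; they do not come from the executive order or from any agency's actual schema.

```python
import csv
import io
import json

# Hypothetical records, as they might look after transcription from paper forms.
RECORDS = [
    {"permit_id": "A-1001", "issued": "2013-04-02", "fee_usd": 125.0},
    {"permit_id": "A-1002", "issued": "2013-04-09", "fee_usd": 80.5},
]


def to_json(records):
    """Serialize the records as a JSON array with stable key order."""
    return json.dumps(records, indent=2, sort_keys=True)


def to_csv(records):
    """Serialize the records as CSV with a header row; empty input gives ''."""
    if not records:
        return ""
    # Sort the column names so the output is the same on every run.
    fieldnames = sorted(records[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_json(RECORDS))
    print(to_csv(RECORDS))
```

Once data is in a form like this, any developer can load it with standard tools, which is the practical difference between a scanned document and a discrete piece of open data.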
At Captricity, we have a platform that can help with this piece of the puzzle. Getting data efficiently from file cabinets or file servers into a machine-readable format is a huge hurdle in complying with the new order. Our solution not only captures data from static forms; its unique blend of machine learning and human intelligence also ensures that the data is extracted within hours and with a 99% accuracy rate. And our unique approach to privacy ensures the protection of sensitive data.
Since participating in the Code for America Accelerator Program, we’ve worked with government agencies and institutions to help them overcome exactly this challenge.
This executive order marks a sea change in how the government manages its data, and we all stand to benefit. We also understand that the path to get there is long and hard, particularly in times of cutbacks and the sequester. We feel confident we can help solve some big parts of the problem and look forward to working with you all toward an efficient, cost-effective solution. To learn more about how we can help, please contact us.