The Small Data Revolution
July 02, 2013 by Jennifer Cobb
Big data is all the rage these days. And for good reason. Great insights will be gleaned by analyzing the vast stores of information accumulating around the world. As Eric Schmidt recently noted, every two days, we create as much information as we did from the dawn of civilization through 2003. This volume is poised to rise even faster as we attach sensors to everything from roads to clothing in the vast Internet of Things.
But the capacity to collect, analyze and interpret the stores of data marked by significant volume, variety and velocity – the “three v’s” of big data – is currently limited to those organizations with serious resources. Google, Facebook and Apple have the capacity and the talent to reap the rewards of big data. Other big businesses, governments and NGOs are quickly catching up.
But what about the rest of us who understand that data-driven insights are critical to good strategic and financial outcomes? Are we just out in the cold?
No. As powerful as big data is, vast amounts of intelligence and insight also lurk in what we call small data, defined as “everything processed in Excel” by Rufus Pollock, founder of the Open Knowledge Foundation. In fact, small data provides even better and more accurate information in many cases.
Pollock recently summed up our point of view in the Guardian when he wrote, “the real opportunity is not big data, but small data. Not ‘one ring to rule them all’ but ‘small pieces loosely joined’.”
Most businesses and organizations have a lot of “small pieces” of data somewhere in their workflow. Often, this small data is very high value. It is information that people have filled out and sent to you, ranging from feedback surveys to invoices and receipts. This is not inferred or third-hand data; it is direct communication. In data parlance, most of this small data is very high signal and low noise, the gold standard for accuracy and excellence.
But lots of small data is hard to gather. In many cases, it is in a form that makes it hard to translate into the machine-readable formats we need to perform effective analysis. It comes in as handwriting on paper, faxed order forms and invoices, or static PDFs. All too often, businesses and organizations end up performing manual data entry to get this high-quality information into spreadsheets, databases and CRM systems. That process is slow, expensive and not very accurate.
As Pollock points out, the ability to collect, aggregate and collaborate across small data sets is the real revolution. He writes that the real data revolution “isn’t about large organizations running parallel software on tens of thousands of servers, but about more people than ever being able to collaborate effectively around a distributed ecosystem of information, an ecosystem of small data.”
Captricity specializes in small data. Our crowd-guided machine learning and computer vision platform turns paper, fax and PDF forms into actionable data in hours, with 99% accuracy. We handle the hardest small data documents with ease, including handwriting on paper and human marks of all types. To learn more about our products, click here. Or contact us and we will be happy to help ignite your small data revolution.