We know that some of the documents we process contain sensitive or confidential information. We are deeply familiar with, and follow, the privacy regulations of IRB, HIPAA, and other governing bodies. We do everything we can to help our customers conform to these regulations. To this end, Captricity is built with state-of-the-art security fully integrated into every step of the data digitization process, and is 100% HIPAA compliant.

Shredded data verification

Shreddr, the technology that powers Captricity, is named after the well-known document shredders used across industries to protect confidential data. Shreddr technology works by isolating pieces of information, or data fields, within a form into distinct images.

Each field, or “shred,” is then read and digitized out of context from the rest of the form by one of thousands of data entry workers spread across the globe (the majority of whom come from Amazon’s Mechanical Turk). The data entry and review process is designed so that each worker is assigned to process a given class or type of data, such as Lastname, from many forms rather than a group of complete forms. This ensures that each worker sees at most one piece of data, or shred, from any single form.

Shredded work

In the diagram above, Worker 1 processes only Lastname fields from hundreds of forms without ever seeing ID number, City, Age, or any other fields from a single form. Moreover, workers process data in a “blind” fashion, meaning that they are not informed which class or type of data they have been assigned. In the example above, this means that Worker 1 does not know they are processing Lastname fields, Worker 2 does not know they are processing ID numbers, and so on.

In this way, workers focus only on accurately transcribing the specific data contained in their assigned fields, or shreds, without need for or knowledge of additional context.
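As a rough illustration of the assignment scheme described above, the following Python sketch groups shreds by field class and hands each class to a different worker. It is not Captricity's actual implementation: the function name `assign_shreds`, the dictionary-based form representation, and the one-worker-per-class simplification are all assumptions made for this example.

```python
from collections import defaultdict


def assign_shreds(forms, workers):
    """Hypothetical sketch: give each field class to exactly one worker.

    forms   -- list of dicts, one per form, mapping field name -> field content
    workers -- list of worker ids (at least one per field class)

    Returns (queues, key):
      queues -- worker id -> list of shred contents; the field name is left
                out so the worker transcribes "blind"
      key    -- worker id -> list of (form_index, field_name), kept private
                so results can be reassembled into per-form records
    """
    # Collect every shred under its field class (Lastname, City, ...).
    by_class = defaultdict(list)
    for form_index, form in enumerate(forms):
        for field_name, value in form.items():
            by_class[field_name].append((form_index, value))

    field_classes = sorted(by_class)
    if len(workers) < len(field_classes):
        raise ValueError("need at least one worker per field class")

    queues, key = {}, {}
    for worker, field_name in zip(workers, field_classes):
        shreds = by_class[field_name]
        queues[worker] = [value for _, value in shreds]
        key[worker] = [(form_index, field_name) for form_index, _ in shreds]
    return queues, key


forms = [
    {"Lastname": "Ng", "City": "Oslo", "Age": "41"},
    {"Lastname": "Roy", "City": "Lima", "Age": "29"},
]
queues, key = assign_shreds(forms, ["worker_1", "worker_2", "worker_3"])
```

Because each worker's queue is drawn from a single field class, no queue ever contains two shreds from the same form, and the queue itself carries no label saying which class it holds.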

An additional note about Shreddr technology and security: Each shred is protected by a special algorithm so that even if someone managed to gather a large collection of shreds, it would be virtually impossible to reconstruct the original form, a feature even paper shredders aren’t able to claim!


In some cases, such as with social security numbers, a single field is itself personally identifiable information (PII) and may be exposed even with shredding technology. There are two solutions:

  1. If you do not need this information to be digitized, you can redact or black out the field using Advanced features in the defining fields step. This information will not be seen by anyone, and will also not be included in your results.

  2. If you do need this information to be digitized, you can break the PII-containing field into separate fields. For example, you can break a social security number into three separate fields so that each field will be read and digitized by a different worker.

Social security number broken into three fields
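To make the second option concrete, here is a minimal Python sketch of the idea. In practice the split is made on the form itself when defining fields, so each part becomes its own image; this example works on text only, and the function name `split_ssn` and the 3-2-4 digit grouping are assumptions chosen for illustration.

```python
def split_ssn(ssn):
    """Hypothetical helper: break a 9-digit SSN into three separate parts.

    Each part would be treated as its own field, so no single worker
    sees the complete number.
    """
    digits = "".join(ch for ch in ssn if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("expected exactly 9 digits")
    return digits[:3], digits[3:5], digits[5:]


# A made-up number, used only as sample input.
area, group, serial = split_ssn("123-45-6789")
```

Each of the three parts can then be routed to a different worker, in the same way as any other field class.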

Privacy-certified workers

Another option for processing sensitive or PII-containing fields is to route the work through privacy-certified workers or your own staff. Contact us to pursue or learn more about this option.
