3 Contributing Factors and 1 Box Ticked

What confidence do you have in the quality of important data captured manually by your business? Important data means the facts and figures on which your products or services are based, on which business decisions are made, on which Sales and Marketing depend to impress prospects and customers, and so on.

I consider here the quality of data collection that is not 100% automated (using sensors, barcode readers, and the like), where the human mind and body are still part of the process. Yes: in spite of the ideals of the paperless organization and robotic reliability, manual data capture is still a major part of our world and a vital part in the success (or otherwise) of many businesses. Consider a process in which information is to be captured from a paper document or, more likely these days, a computerized image of a document… A common perception is that quality is assured through software controls at the point of capture, by an operator using a purpose-built data entry application. Data is keyed in, and it’s presumed that’s that, because the application keeps the human operator within bounds. And yes, software is certainly a factor contributing to data capture quality, but it is no guarantee of it. Here are 3 more factors that contribute to the quality of data manually captured from source documents, for you to think about in the context of your own operations.

1. Source data quality

Handwriting, which still presents difficulties for automated recognition depending on style and context, can be an impediment to the quality of captured data through sheer illegibility or mis-recognition of characters. Joined-up (cursive) handwriting, or idiosyncrasies of orthography in non-native languages (e.g. Dutch: ij versus y; a differently formed “8”; dashes that look like underscores; etc.), are cases in point. Training is valuable here, so operators can get to grips with how characters are drawn and connected, and even practice reproducing the writing for themselves. Some loss of quality will always have to be tolerated when capturing from handwritten source material, but the degradation can be mitigated.

2. Business Rules

Representing the bridge between the input screen and the source document, business rules explain for each field what goes in and where it comes from. This is not always a simple correspondence, meaning that operators have decisions to make, which they can get wrong. Rules also explain what to do in exceptional circumstances, such as when required data is missing or incomplete, as well as when to regard a document as invalid for capture. It is one thing to say ‘do not capture incomplete documents’ but another to explain the precise meaning of ‘incomplete’. Failure to be precise leads to errors and inconsistencies. The quality of instructions has an impact on the quality of the captured data.

3. Double keying

This doubles the cost but increases accuracy twenty-fold. While an operator might make on average 1 character error (typo) in every 100, when a second operator keys the same data the chance of that operator also making a typo at the same character is 1 in 100, so (if the two operators err independently) both passes are wrong at the same character only about 1 time in 10,000. This allows character differences between the two keying passes to be spotted by a program and presented to a third operator who can decide, with reference to the source document, what the character should be, editing accordingly. Anomalies will prevent the theoretical 100-fold uplift from being reached in practice, but it is a very effective process for dramatically increasing data quality where higher accuracy is important.
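For readers who like to see the mechanics, here is a minimal sketch in Python of the comparison step and of the arithmetic behind the figures above. It is an illustration only, not a description of any particular data entry product: the function name `find_discrepancies` and the sample strings are hypothetical, and the calculation assumes the two operators make their typos independently of each other.

```python
from fractions import Fraction
from itertools import zip_longest


def find_discrepancies(pass_one: str, pass_two: str) -> list[tuple[int, str, str]]:
    """Compare two keying passes character by character.

    Returns (position, character from pass one, character from pass two)
    for every position where the passes disagree. A pass that is shorter
    than the other is padded with empty strings, so a missing character
    also shows up as a discrepancy.
    """
    return [
        (position, first, second)
        for position, (first, second) in enumerate(
            zip_longest(pass_one, pass_two, fillvalue="")
        )
        if first != second
    ]


# Hypothetical example: the second operator misreads the final "8" as a "3".
discrepancies = find_discrepancies("Bates 1948", "Bates 1943")

# Arithmetic from the text, assuming independent errors (an assumption).
single_pass_error_rate = Fraction(1, 100)            # 1 typo per 100 characters
both_wrong_same_character = single_pass_error_rate ** 2
theoretical_uplift = single_pass_error_rate / both_wrong_same_character

print(discrepancies)
print(both_wrong_same_character, theoretical_uplift)
```

Each tuple in `discrepancies` is what would be put in front of the third operator alongside the source document. Note that a purely positional comparison like this one is deliberately naive: a single omitted or extra character shifts everything after it and produces a run of mismatches, which is one of the anomalies a real system would need to handle.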

Fitness for Purpose

Each of these three factors contributes to the quality of captured data (through knowledge acquisition, clear instruction, and capitalizing on skilled keyboard operators working in tandem) in ways that data entry applications certainly support, but alone do not guarantee. In most cases, the imperative is that captured data be fit for purpose (for products, for decision-making, for Sales…) and the target quality is not always the same or even necessarily high. There is a trade-off between quality and the cost of achieving it. The critical step is deciding what accuracy is fit for purpose and then designing a system to deliver it without incurring unnecessary cost. Having a target accuracy is the start of it, but how do you prove that the desired quality has been achieved? As with many processes, it’s done by sampling. ISO 2859-1:1999, Sampling Procedures For Inspection By Attributes, can be recommended for this. Proving, through sampling, that your captured data is fit for purpose ticks a critical box for your business. In my next blog post on maintaining high quality in manual data capture, I will describe three more factors that impact results, and explore accuracy rates and sampling further.

Simon Bates