Channeling Grammy: Adventures in Indexing #3

I’ve been helping with the indexing of the 1940 US Census for 11 days now and have completed about 20 batches. I decided it was time to check my arbitration results to see how accurate I was.

I was pleased to see that almost all of my batches were in the 98-100% accuracy range. Of course, with so many fields to transcribe, 100% doesn’t necessarily mean zero errors; it just means that fewer than half a percent of the transcribed fields were judged to be in error.

However, there was one batch that was low at 91%. I wondered why that one was so much lower than the others. What did I find? One error about 25 times. “W” instead of “White” in the Race field.

If you’ve been indexing, you know that the software is kind enough to auto-fill fields. For instance, once you’ve typed “Texas” in the Place of Birth field one time, you can just type “T” and tab out of it on later rows and it will automatically fill in the rest of the word. However, if you have entered more than one value that starts with “T” (Tennessee, for instance), it won’t auto-fill until you have typed enough letters for the system to know which one you want (Tex or Ten).

As best I can guess, on one line I must have typed a space after the W in the Race field, so it didn’t fill in the rest. On subsequent lines, a plain W was no longer unique, so when I tabbed to the next field it didn’t auto-fill. Unfortunately, I didn’t notice the mistake and it continued through the rest of the page. Maybe I didn’t notice it because some of the other fields (Sex, Marital Status) are entered as only the starting letter.
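For the programmers reading along, here is a rough sketch of the auto-fill behavior as I understand it, along with my guess about what went wrong. This is only my mental model, not the indexing software's actual code; the function name `autofill` and the idea of a per-field `history` list are my own assumptions.

```python
def autofill(typed, history):
    """Complete `typed` only if exactly one earlier entry starts with it.

    Hypothetical model of the auto-fill: `history` is the list of values
    already entered in this field on earlier rows.
    """
    if not typed:
        return typed
    # Distinct earlier entries that begin with the typed text.
    matches = {value for value in history if value.startswith(typed)}
    if len(matches) == 1:
        return matches.pop()
    # Zero or several candidates: leave the field exactly as typed.
    return typed


# Place of Birth: "T" is unique until a second "T" state shows up.
print(autofill("T", ["Texas"]))                 # Texas
print(autofill("T", ["Texas", "Tennessee"]))    # T
print(autofill("Tex", ["Texas", "Tennessee"]))  # Texas

# Race: "W " (with a space) matches nothing, so it stays as typed...
print(repr(autofill("W ", ["White"])))          # 'W '
# ...and once "W " is in the history, a plain "W" is ambiguous.
print(autofill("W", ["White", "W "]))           # W
```

If the real software works anything like this, a single stray space would be enough to turn off auto-fill for that letter for the rest of the page.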

What’s interesting to me is that the Quality Checker didn’t catch my mistakes. I’d love to see an upgrade to the indexing software that would include the Race field in the Quality Checker!
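Here is one way such a check might look. It is just a sketch of my wish, not how the Quality Checker actually works: the function name `flag_probable_truncations` and the rule itself (flag any entry that is the beginning of a longer entry in the same column) are my own invention.

```python
def flag_probable_truncations(values):
    """Return the row indices whose entry looks like a cut-off version
    of a longer entry elsewhere in the same column.

    Hypothetical rule: after trimming spaces, an entry is suspicious if
    some other entry in the column starts with it.
    """
    cleaned = [value.strip() for value in values]
    distinct = set(cleaned)
    flagged = []
    for index, entry in enumerate(cleaned):
        if not entry:
            continue  # blank fields are a different kind of problem
        if any(other != entry and other.startswith(entry) for other in distinct):
            flagged.append(index)
    return flagged


race_column = ["White", "White", "W ", "W", "W"]
print(flag_probable_truncations(race_column))  # [2, 3, 4]
```

A rule this simple would occasionally flag legitimate entries (a place name that happens to be the start of a longer one, for example), so it would work best as a warning to review rather than an automatic correction.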

Friday, April 13, 2012

