This week I began working on transcribing documents using Google cloud drop. This Google app has text recognition software that’s supposed to be more advanced than Adobe Acrobat’s, and will hopefully make our transcriptions go faster. While it can’t recognize anything that’s handwritten (at least in cursive), it does very well at recognizing the text of several cablegrams and typewritten documents we have.
It may seem pointless to be transcribing documents that are written in print. The majority are clearly legible, and often rather short, so why transcribe them? While we may be able to read them with ease, the vast majority of computers and search engines can’t because the text is not in a format they can understand.
Let’s say you’re trying to find all documents that contain the phrase “safe for democracy”. If you type this into a search engine you will get varied results, and probably not many documents containing that phrase, unless they’ve been transcribed. This is because computers use the language of 1s and 0s. Everything you see on a computer screen is actually some combination of those two digits, which tell the computer what to put up on your screen. Every letter in this post is stored as its own combination of 1s and 0s, which allows the computer to present you with a language you understand.
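To make the 1s and 0s idea a little more concrete, here is a small Python sketch that prints the pattern behind each letter of a word. It assumes the text is stored as UTF-8, which is my assumption (it is the most common encoding on the web), and the function name `to_bits` is made up for illustration.

```python
def to_bits(text):
    """Return the UTF-8 bytes of `text` as a list of 8-character bit strings."""
    return [format(byte, "08b") for byte in text.encode("utf-8")]


# Print the bit pattern the computer stores for each letter of "safe".
for letter in "safe":
    print(letter, to_bits(letter))
```

Each letter comes out as its own string of eight 1s and 0s, which is what the computer actually keeps in memory.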
But most computers can’t understand words written on a scanned page, because to the computer that scan is only a picture, with no underlying text code to tell it which letters are there. This is where transcription comes in. Transcribing is as much for the computer’s readability as yours, and by typing out the contents of a letter into a computer, the letter then becomes text-searchable.
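Here is a rough sketch of what "text-searchable" means in practice. The document names and contents are invented sample data, and `find_documents` is a hypothetical helper, not part of any real search engine. The point is only that a scan with no transcription gives the search nothing to match against.

```python
def find_documents(transcriptions, phrase):
    """Return the names of documents whose transcribed text contains `phrase`."""
    needle = phrase.lower()
    return [name for name, text in transcriptions.items() if needle in text.lower()]


# Hypothetical sample data: document name -> transcribed text.
transcriptions = {
    "cablegram_001": "The world must be made safe for democracy.",
    "letter_002": "I hope to see you again soon.",
    "scan_003": "",  # scanned image only, never transcribed
}

matches = find_documents(transcriptions, "safe for democracy")
print(matches)
```

Only the transcribed cablegram turns up. The untranscribed scan could contain the same phrase on the page and the search would still miss it.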
So if computers can’t read words the same way we do, then how can Google cloud drop do it? The answer is text recognition software, which an app typically reaches through an API, or Application Programming Interface. The API itself is not the learning software; it is the set of rules that lets one program ask another to do a job for it, which is why it is so useful for developing apps. The recognition software on the other end is what teaches computers how to recognize things not written in computer code. A programmer writes code that tells the computer what certain letters look like, how to recognize letters that are grouped together to form words, and even some specific words. The computer is then put to work.
However, the computer only has the basics and can’t always recognize all words. Sometimes it will still mistake certain letters for others, especially if they’re smudged or slightly too close together, causing some pretty wacky transcriptions. For example, if I see the typo “Ihope to see you againsoon”, I know that “I” and “hope”, and “again” and “soon”, are meant to be separated, but a computer will think each pair is one word, or blend a few letters together like this: “agaiÑoon”.
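As a toy illustration of why run-together words are hard, here is a Python sketch that tries to split a token into words it already knows. This is not how the recognition software actually works; the tiny vocabulary and the function name `split_run_together` are assumptions made up for the example.

```python
def split_run_together(token, vocabulary):
    """Split `token` into known words, or return None if no split exists."""
    lowered = token.lower()
    # best[i] holds a list of words covering lowered[:i], or None if none found.
    best = [None] * (len(lowered) + 1)
    best[0] = []
    for end in range(1, len(lowered) + 1):
        for start in range(end):
            piece = lowered[start:end]
            if best[start] is not None and piece in vocabulary:
                best[end] = best[start] + [piece]
                break
    return best[len(lowered)]


# Hypothetical vocabulary: the only words this toy program "knows".
vocabulary = {"i", "hope", "to", "see", "you", "again", "soon"}

print(split_run_together("Ihope", vocabulary))
print(split_run_together("againsoon", vocabulary))
print(split_run_together("xyzzy", vocabulary))
```

The sketch only succeeds when every piece is in its word list, and it returns the pieces in lowercase. Anything outside the list, or any letter that was misread to begin with, leaves it with no answer, which is roughly the position the computer is in with a smudged page.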
But because it’s a learning software, you can actually correct its errors and teach it the correct reading, and the computer will remember that for next time. If you repeat this process enough times the computer will eventually be able to do it without help. This learning curve is what keeps many website builders from using it, except as a separate application that won’t affect their own platform.
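The "correct it once and it remembers" idea can be sketched very simply in Python. Real learning software is far more sophisticated than this; the sketch below is just a lookup table of past corrections, and the class name `CorrectionMemory` and its methods are hypothetical.

```python
class CorrectionMemory:
    """Remember human corrections and reapply them to later OCR output."""

    def __init__(self):
        # Maps a misread word to the reading a person supplied for it.
        self.corrections = {}

    def teach(self, wrong, right):
        """Store a correction so it can be reused next time."""
        self.corrections[wrong] = right

    def apply(self, text):
        """Replace every remembered misreading in `text`, word by word."""
        return " ".join(self.corrections.get(word, word) for word in text.split())


memory = CorrectionMemory()
memory.teach("agaiÑoon", "again soon")
print(memory.apply("I hope to see you agaiÑoon"))
```

Once the correction has been taught, the same garbled word is fixed automatically wherever it shows up again, while words the memory has never seen pass through unchanged.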
At the WWPL, our cloud drop is still in need of lots of editing, and can only handle documents under 25 KB. But it has still saved me a bit of time on transcribing. Instead of spending 30 minutes on a document, I’m spending about 15 just editing. So that’s an improvement.