Week 11 – Even More Exhibit Prep

This week I completed the transcriptions from the Ochs collection that I may use in the exhibit, which was a helpful review of relevant information that I can include. The Ochs collection is so large that there were many important pieces of his career that I had missed or forgotten. I was able to fill in the timeline a little and do some brainstorming for the exhibit outline as well.

I hope to present Mark with this outline next week and see what advice he may have for me.

Week 10 – More Exhibit Prep

This was a short week, since I had to miss internship for a work-related event on Tuesday, and Wednesday we got 10 inches of snow. To make up for the hours I missed, I spent some time at home transcribing letters from the Ochs collection that I think will be relevant to the online exhibit.

Once I was able to get back to internship, I visited the soldiers’ lives exhibit at the museum to get inspiration.

This exhibit focuses on three soldiers and a nurse and includes a brief bio of each, as well as documents and artifacts that illustrate their experiences during the war.

I’d like to do something similar with my project, although I hope to arrange it in somewhat chronological order rather than by person. It helps that each soldier’s period of activity in his respective collection already falls into order, more or less. For example: Ambuehl (training camps), Cahill (trenches and early wounding), Ochs (work with the Stars and Stripes and post-war military career). I need to decide whether to include a fourth section on Ambuehl’s death or include that in the first section.

Week 9 – Cahill Collection and Exhibit Prep

This week I completed scanning the Cahill Collection and created the .csv metadata spreadsheet, which included a transcription of Cahill’s letter to his sister.

It was the letter that sparked the beginning of my planning for my online exhibit. Some of Cahill’s descriptions of the front are truly heartrending, and I couldn’t help thinking about Ambuehl and Ochs, and where their experiences paralleled and diverged. Ambuehl surely experienced some of the same sights and sounds and fears and annoyances as Cahill, but most of his letters home from his time in France were lost. Did he write some of the same things, or did he try to sugarcoat his experiences so his sister wouldn’t worry about him? And, unlike Cahill, Ambuehl didn’t make it home. He could have been one of the wounded that Cahill saw carried away from the trenches, the stretchers dripping with blood.

And Ochs, who often complained about not being stationed at the front but fought his own kind of war at the Stars and Stripes office–how would he have coped with the tragedies Cahill saw? One thing Cahill writes that really stuck with me was that the soldiers who complained at camp when things were easy were the most resilient on the front and encouraged the other men not to lose heart. While I tend to think of Ochs as an impulsive kid with a tendency to complain, maybe he would have held his own in the trenches after all.

What I’d like to do is focus on weaving together the stories of the three soldiers my internship has focused on so far (Ambuehl, Ochs, and Cahill) and show how the materials in WWPL’s collections reveal different aspects of soldiers’ experiences during WWI. The Ambuehl collection focuses on military training camps, as well as the death of a soldier, while the Cahill collection documents a soldier’s experience on the battlefield and the Ochs collection follows a soldier’s career away from the trenches.

I began a rough timeline of events in the soldiers’ lives as documented in the collections, and I’ll continue thinking about how to arrange the exhibit and what materials to include. Once I have a better outline for the exhibit, I’ll present it to Mark.

Week 8 – Bouman and Cahill Collections

This week I finished up the spreadsheet for the Bouman Collection and started in on scanning the Cahill Collection.

This collection is a small packet of photos and one letter by Edward Cahill, a WWI soldier. Most of the photos are of Cahill and his wife post-war, but there are three photos of a young Franklin D. Roosevelt with soldiers at the beginning of the war. I’m not sure whether Cahill is in these photos; it’s hard to tell. The letter was written when Cahill was at Walter Reed hospital, and when I read it I learned that he was one of the first American soldiers wounded in the war. The letter is to his sister, describing his experiences in France.

Week 7 – Ochs and Bouman Collections

This week I wrapped up the Ochs Collection spreadsheet, with help from Mark, who wanted to help me get ahead so I’d get some experience in other areas.

I made good progress in one of those other areas through working on the Bouman Collection, one of my first digitization projects from my volunteer days at WWPL. Although I scanned all the letters from the collection to multipage PDFs a few years ago so the donor could have a digital copy, they had not yet been uploaded to the Omeka site. I asked Mark this week if I could contribute to digitizing this collection, because I have the knowledge of the collection that would help get the work done quickly, and because the bulk of the collection will provide great material for the library to use during the upcoming centennial of the 1919 Paris Peace Conference.

Jon Anthony Bouman was a British AP correspondent who was working in Europe during WWI. The bulk of the collection is letters from Bouman to his wife and children from Paris during the 1919 peace conference and from Germany during the 1920s. While the subject is a little off course from my project topic, the letters offer great insights into both the peace conference and the cultural atmosphere of postwar Europe.

In addition, I got to experience a stage of the digitization process that I had not worked with before. Mark showed me how to upload the PDFs to a host site that provides Omeka with URLs. I completed this stage of the process and also got through about half of the .csv spreadsheet of metadata for the collection.

Week nine: 03/19-03/23

I ended up having to work from home this week because of some pretty heavy snows, which ironically hit us right after the first official day of spring. I continued working on my finding aid for the Race and Segregation collection, formatting the citations and information on each piece in the collection. However, I’ll have to put that one on hold until I can get back to the collection and continue cataloguing it.

I spent the rest of the time working on transcribing documents as usual. I transcribed one cablegram laying out the terms of the naval armistice during the First World War. This document was poorly scanned (or photographed? I can’t actually tell), so it was really hard to make out in places. Many of the black, block-printed words had bled together to form solid black lines, and one side of the document was shrouded in darkness. But considering the poor quality of the image, I think I still managed to make out a good deal of it. It doesn’t seem like I’ll be needing glasses anytime soon!


Week eight: 03/12-03/16

After a very restful spring break, I’m back at the library. We finally decided to retire the cloud drop and go back to transcribing things by hand. But now I’m starting a new project which is a bit more interesting.

Now I’m working on a Library Guide for the Race and Segregation collection. This is essentially a catalogue of the items in the collection, with a section at the beginning to help historicize and contextualize the collection.

I enjoy getting to actually do a bit of research for the first section. I’ve always liked looking for things and trying to piece together bits of information. I like the challenge and the satisfaction of actually learning something of value, making arguments, and finding the evidence. It’s too bad my biographical notes section can only be a couple of paragraphs.

The rest of the library guide isn’t all that exciting. Cataloging is slow work; important, but very slow. I’m now going back through the Excel sheets I made earlier in the semester to get the information I need for the catalogue.

I’ve been listening to podcasts to help keep the mind fog at bay, which creeps up more easily than you’d think. I’ve come across some very humorous ones that have left me giggling at my desk. I’m sure the others think I’m crazy, giggling at my desk while I go over a collection about race and segregation. Little do they know that I’m actually listening to a story about a woman’s pet bird that is hated by the rest of her family, which features a recording of it screeching. Who knew that macaws could sound like the velociraptors from Jurassic Park? The idea of this little bird creating such a chilling sound is extremely amusing to me.


Week seven: 02/26-03/02

I kept up with the transcribing this week. The documents are coming along, but the cloud drop isn’t proving to be as helpful as we had hoped. One problem with it is that it won’t accept PDF files, only JPEGs. So that means I have to take every PDF and convert it into a JPEG using Photoshop. While this is an easy enough conversion, it’s still really time-consuming and not all that efficient.

Photoshop can only convert one page of a document at a time, and you have to do some extra steps to keep all the pages together. I managed to get all the pages together by creating a custom panorama, where you can stitch together panels in any order you like. However, I encountered another snag because the cloud drop can only handle images of about 25 KB, and panoramas are about five times that.

So now I’m back to transcribing one page at a time. For a seventeen-page document, this gets tedious very fast. The computer still doesn’t do that great of a job at transcribing, and if the image is bigger than 25 KB it will only transcribe about half the page. So I’m still spending copious amounts of time editing and transcribing myself.

I had high hopes for the cloud drop, and for the smaller documents it did do pretty well. But I think I’ll stick with transcribing things myself, just so I can save some time.

Week six: 02/19-02/23

This week I began working on transcribing documents using Google cloud drop. This Google app has text recognition software that’s supposed to be more advanced than Adobe Acrobat, and will hopefully make our transcriptions go by faster. While it can’t recognize anything that’s handwritten (at least in cursive), it does very well at recognizing the text of several cablegrams and typewritten documents we have.

It may seem pointless to be transcribing documents that are written in print. The majority are clearly legible, and often rather short, so why transcribe them? While we may be able to read them with ease, the vast majority of computers and search engines can’t because the text is not in a format they can understand.

Let’s say you’re trying to find all documents that contain the phrase “safe for democracy”. If you type this in the search engine you will get varied results, and probably not many documents containing that phrase, unless they’ve been transcribed. This is because computers use the language of 1s and 0s. Everything you see on a computer screen is actually some combination of those two numbers, which tell the computer what to put up on your screen. Every letter in this post is also a unique combination that allows the computer to present you with a language you understand.
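To make the 1s and 0s a little more concrete, here is a minimal Python sketch that prints the bit pattern behind each letter of a word. It assumes the text is stored in the common UTF-8 encoding, and the function name `to_bits` is just something I made up for illustration.

```python
def to_bits(text):
    """Return one string of 1s and 0s per byte of the UTF-8 encoding of `text`."""
    return [format(byte, "08b") for byte in text.encode("utf-8")]


# Show the pattern of 1s and 0s that stands behind each letter of "safe".
for letter, bits in zip("safe", to_bits("safe")):
    print(letter, bits)
# s 01110011
# a 01100001
# f 01100110
# e 01100101
```

Each plain English letter takes up one group of eight bits here, which is why pairing the letters with the bit groups works for this example.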

But most computers can’t understand words written on a scanned page, because that image does not have the underlying code that translates it for the computer. This is where transcription comes in.  Transcribing is as much for the computer’s readability as yours, and by typing out the contents of a letter into a computer, the letter then becomes text searchable.
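Here is a rough sketch in Python of what "text searchable" means in practice. The document IDs and the `find_documents` function are invented for the example and are not how our Omeka site actually works; the point is only that a search can match typed-out text, while a scan with no transcription gives it nothing to match.

```python
def find_documents(transcriptions, phrase):
    """Return the IDs of documents whose transcription contains `phrase`.

    The comparison ignores upper and lower case.
    """
    needle = phrase.casefold()
    return [
        doc_id
        for doc_id, text in transcriptions.items()
        if needle in text.casefold()
    ]


# Made-up sample data: two transcribed documents and one untranscribed scan.
transcriptions = {
    "letter-001": "The world must be made safe for democracy.",
    "cablegram-002": "Terms of the naval armistice follow.",
    "scan-003": "",  # an image with no transcription yet, so nothing to search
}

print(find_documents(transcriptions, "Safe for Democracy"))
# ['letter-001']
```

The untranscribed scan never shows up in the results, no matter what words are visible in the image itself.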

So if computers can’t read words the same way we do, then how can Google cloud drop do it? The answer is text recognition software (often called OCR, for optical character recognition), which we reach through an API, or Application Programming Interface. The API itself is just the connection that lets one program hand a job to another: we send it an image, and it sends back the text it found. The recognition happens on the other end, in software that has been taught what certain letters look like, how to recognize letters that are grouped together to form words, and even some specific words, usually by being shown a great many examples. The computer is then put to work.

However, the computer only has the basics and can’t always recognize all words. Sometimes it will still mistake certain letters for others, especially if they’re smudged or slightly too close together, causing some pretty wacky transcriptions. For example, if I see the typo “Ihope to see you againsoon”, I know that “I” and “hope”, and “again” and “soon” are meant to be separated, but a computer will think it’s one word, or blend a few letters together like this: “agaiÑoon”.
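As a toy illustration of the kind of fixing I do by hand, the Python sketch below tries to split a run-together word into words from a small list it already knows. This is my own simplified example, not how the cloud drop works internally; the function name `split_run_together` and the tiny vocabulary are made up for the demonstration.

```python
def split_run_together(token, vocabulary):
    """Split `token` into known words, or return None if it can't be done.

    `vocabulary` is a set of lowercase words. The result is lowercase.
    """
    lowered = token.lower()
    n = len(lowered)
    # best[i] is a list of words that spells lowered[:i], or None if none was found.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            piece = lowered[start:end]
            if best[start] is not None and piece in vocabulary:
                best[end] = best[start] + [piece]
                break
    return best[n]


vocabulary = {"i", "hope", "to", "see", "you", "again", "soon"}

print(split_run_together("Ihope", vocabulary))      # ['i', 'hope']
print(split_run_together("againsoon", vocabulary))  # ['again', 'soon']
print(split_run_together("agaiÑoon", vocabulary))   # None
```

The missing spaces can be recovered because the pieces are real words, but the blended “agaiÑoon” matches nothing in the list, so a person still has to step in and fix it.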

But because it’s learning software, you can actually correct its errors and teach it what the correct thing is, and the computer will remember that for next time. If you repeat this process enough times the computer will eventually be able to do it without help. This learning curve is what keeps many website builders from using it, except as a separate application that won’t affect their own platform.

At the WWPL, our cloud drop is still in need of lots of editing, and can only handle documents under 25 KB. But it’s still saved me a bit of time on transcribing. Instead of spending 30 minutes on a document, I’m spending about 15 just editing. So that’s an improvement.