Why is it important to transcribe the documents you consult?

Why do you need to transcribe the documents you consult?

Today, thanks to cameras and scanners, it’s easy to keep a digital copy of the documents you consult, and it can be tempting not to transcribe their content.

Why should you transcribe a document of which you have an image?

From an IT point of view, images and text are two kinds of data with very different characteristics. An image is just a grid of colored points (pixels) and cannot be searched for information as it is. This is why, to make use of a photographed or scanned text, it must first be transcribed into text.
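The difference can be illustrated with a minimal sketch. The sample sentence and the tiny 5x5 bitmap below are invented for illustration, and real image files are far larger and usually compressed, but the principle is the same: a search looks for characters, and an image only holds pixel values.

```python
# A transcription is a sequence of characters, so a search is a simple substring test.
transcription = "Baptism of Jean Martin, 12 March 1742"  # invented sample text

# An image is a grid of pixel values. This hypothetical 5x5 bitmap draws the
# letter "T" (1 = dark pixel, 0 = light pixel), yet it stores no character at all.
bitmap_t = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
image_bytes = bytes(pixel for row in bitmap_t for pixel in row)


def contains(data: bytes, word: str) -> bool:
    """Return True if the encoded word appears in the raw data."""
    return word.encode("utf-8") in data


found_in_text = contains(transcription.encode("utf-8"), "Martin")
found_in_image = contains(image_bytes, "T")

print(found_in_text)   # the name is found in the transcription
print(found_in_image)  # the letter drawn in the image is not found
```

A person sees a "T" in the bitmap, but the computer only sees zeros and ones, which is why the search on the image data fails.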

Are there any tools available to do this?

Character recognition technologies (OCR) exist to convert images into text, but their results are only convincing with images of typewritten text.

As for handwritten texts, AI text recognition is still in its infancy, and the few tests that have been carried out show that a computerized tool cannot yet turn a text written with a pen in an old language into typed text.

So should transcriptions be made manually?

We’re entitled to ask ourselves this question, since the time required to transcribe a text is not insignificant!

There are two possible approaches:

  • A minimalist approach, which consists of noting down only the most important information found in the document, preferably using a form adapted to the type of document (birth certificate, marriage certificate, etc.).
  • A maximalist approach: transcribe the entire text.

The advantages of the minimalist approach are obvious: it saves time and transforms text into structured data (date, place, people…) that can be transferred to genealogy software. On the other hand, you may dismiss a piece of information as unimportant at the time, only to find that it matters a few months later (for example, a witness whose surname turns out to be linked to the genealogy you’re working on).
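As a rough illustration of what "structured data" can look like, here is a minimal sketch of a form for a birth record. The field names, the sample people, and the file path are all hypothetical and are not taken from any particular genealogy software. Recording every person named in the document, witnesses included, limits the risk of losing a detail that becomes useful later.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class Person:
    given_name: str
    surname: str
    role: str  # for example "child", "father", "witness"


@dataclass
class RecordExtract:
    """Key facts noted from one document (hypothetical form layout)."""
    record_type: str                    # "birth", "marriage", ...
    date: str                           # ISO format, e.g. "1742-03-12"
    place: str
    people: list[Person] = field(default_factory=list)
    source_image: Optional[str] = None  # where the scan or photo is filed
    notes: str = ""

    def surnames(self) -> set[str]:
        """All surnames mentioned, including those of witnesses."""
        return {person.surname for person in self.people}


# Invented sample data for illustration only.
extract = RecordExtract(
    record_type="birth",
    date="1742-03-12",
    place="Example parish",
    people=[
        Person("Jean", "Martin", "child"),
        Person("Pierre", "Martin", "father"),
        Person("Louis", "Durand", "witness"),
    ],
    source_image="scans/birth_1742_martin.jpg",
)

print(sorted(extract.surnames()))
print(asdict(extract)["people"][2])
```

The `source_image` field keeps a link back to the digitized document, so the full text can still be checked if a question comes up later.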

The advantage of the maximalist approach is that it allows you to search the entire text with computerized tools (full-text searches). On the other hand, the information is not “structured”, so it is harder to reuse the data in other tools. And don’t forget that the spelling of names can change over the centuries and from one document to another, so a search for one exact spelling can miss relevant passages.
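To show how a full-text search can cope with changing spellings, here is a minimal sketch using Python's standard `difflib` module. The sample sentence, the function name `find_name_variants`, and the 0.8 similarity threshold are assumptions chosen for illustration. It is one possible technique, not a recommended standard.

```python
import re
from difflib import SequenceMatcher


def find_name_variants(text: str, name: str, threshold: float = 0.8) -> list[str]:
    """Return the distinct words in the text whose spelling is close to the name."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters, accents included
    target = name.lower()
    matches: list[str] = []
    for word in words:
        similarity = SequenceMatcher(None, target, word.lower()).ratio()
        if similarity >= threshold and word not in matches:
            matches.append(word)
    return matches


# Invented transcription for illustration only.
transcription = (
    "Present were Pierre Lefebvre, father, and Louis Lefevre, "
    "uncle and witness, as well as Marie Durand."
)

variants = find_name_variants(transcription, "Lefebvre")
print(variants)  # ['Lefebvre', 'Lefevre']
```

An exact search for "Lefebvre" would miss "Lefevre". Lowering the threshold catches more variants but also more unrelated words, so the results still need to be checked by eye.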

Our advice

Our advice is to work with the minimalist method, provided that you keep and file the digital copy of each document consulted, so that you can refer back to it if necessary.


This article is part of our series on Questions (and answers) for the genealogist.

 


Philippe.D (creator of GeneaSofts.Com)

A genealogy enthusiast for over 30 years, I wanted to provide genealogists with simple, innovative software to help them with their research.