These matters are not as straightforward as they might appear at first glance. Firstly, the registers are in most cases several hundred years old and have not been kept in the best of environments, so they are often faded, covered in water marks and occasionally damaged. Secondly, they will have been photographed, for few of us get to handle the original documents today. The process of photography introduces issues of focus and alignment that make reading more difficult. (The registers are usually small bound documents that do not want to lie flat for their pictures to be taken!) Thirdly, the original records were written with different instruments, usually a quill pen of some form, which gives distinct characteristics to the letters and words. Fourthly, the education and writing skills of the vicar or other person writing the record vary significantly; sometimes there will be good penmanship, other times one wonders. Lastly, the alphabet and the graphics used for that alphabet are different from those in use today and have changed over the past 500 years. All of these factors mean that there will be significant uncertainties in what you see and can interpret from a register.
As a result, we have developed a set of transcription rules designed to let you convey the maximum amount of information into the record and to allow the search engine to extract the best solution from the information base. (Just one word of caution: the search engine is not currently able to make use of all of the transcription rules, but it does remain under development.) So let's explore these two aspects in the reverse order.
For those of you who have transcribed for other projects, the first and last rule for transcribing is well known. You transcribe what you read, errors and all! It is left to the researcher to make a correction or adjustment for his or her own purpose. Your job, as job one, is to tell it as it is; nothing more, nothing less. (It is worth noting that this is a change from the original specification for FreeREG, where transcribers were encouraged to convert the old texts into modern usage.) In the next section, on reading the registers, we will try to assist you in extracting the maximum amount of useful information.
Some common types of uncertainty that you are likely to encounter in your first few batches of transcription, and the technique to use for each of them, are given in the table below. The section after the table describes each of the formats.
| Uncertainty | Which Uncertain Character Format to Use |
| --- | --- |
| Can't tell if it's an l or a t | Use the [lt] style of UCF |
| Can tell how many letters I can't read | Use the _ style of UCF, one _ for each letter |
| I think I can read the letter | Use the [x_] style of UCF, where x is what you think the letter is |
| It's 2 or 3 letters I can't read | Use the _{2,3} style of UCF |
| Don't know how many letters I can't read | Use the * style of UCF |
| Not sure if that's a letter or an ink blob | Use the _{0,1} style of UCF |
| Not sure of the word transcribed | Use the ? style of UCF |
| Format | Meaning |
| --- | --- |
| _ (Underscore) | A single uncertain character. It could be anything but is definitely one character. It can be repeated for each uncertain character. |
| * (Asterisk) | Several adjacent uncertain characters. A single * is used when there are 1 or more adjacent uncertain characters. It is not used immediately before or after a _ or another *. Note: If it is clear there is a space, then * * is used to represent 2 words, neither of which can be read. |
| [abc] | A single character that could be any one of the contained characters and only those characters. There must be at least two characters between the brackets. For example, [79] would mean either a 7 or a 9, whereas [C_] would mean a C or some other character. |
| {min,max} | Repeat count: the preceding character occurs somewhere between min and max times. max may be omitted, meaning there is no upper limit. So _{1,} would be equivalent to *, and _{0,1} means that it is unclear if there is any character. |
| ? | Sometimes all of the characters have been read but you remain uncertain of the word. In this case append a ? at the end of the word, e.g. RACHARD? The most frequent place where a ? is used is with transcriptions that have been donated from other systems and are being converted for entry into FreeREG. |
Note: Using a single * is preferable to spending a long time trying to decide the min and max values to use in the _{min,max} format, which is more precise.
Technical note: Although this UCF format has many similarities to regular expressions (e.g. Perl, Unix) it is not identical and in particular there is no escape mechanism.
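To make the similarity to regular expressions concrete, here is a minimal Python sketch that translates a UCF string into a regular expression and tests a candidate reading against it. It is only an illustration of the rules described above, not the way the FreeREG search engine works. The function names `ucf_to_regex` and `ucf_matches` are hypothetical, and three choices are assumptions: a ? is simply ignored when matching, a bracket group containing _ (such as [C_]) is treated as "any one character", and matching ignores case.

```python
import re


def ucf_to_regex(ucf):
    """Translate an Uncertain Character Format string into a regex string.

    Assumes well-formed input; an unclosed [ or { raises ValueError.
    """
    out = []
    i = 0
    while i < len(ucf):
        ch = ucf[i]
        if ch == "_":
            out.append(".")            # exactly one unknown character
        elif ch == "*":
            out.append(".+")           # one or more unknown characters
        elif ch == "[":
            end = ucf.index("]", i)
            members = ucf[i + 1:end]
            if "_" in members:
                out.append(".")        # assumption: [C_] can be any character
            else:
                out.append("[" + "".join(re.escape(m) for m in members) + "]")
            i = end
        elif ch == "{":
            end = ucf.index("}", i)
            out.append(ucf[i:end + 1])  # {min,max} has the same form in a regex
            i = end
        elif ch == "?":
            pass                       # assumption: word-level doubt is ignored
        else:
            out.append(re.escape(ch))  # UCF has no escapes, so all else is literal
        i += 1
    return "".join(out)


def ucf_matches(ucf, candidate):
    """Return True if the candidate reading is consistent with the UCF entry."""
    pattern = ucf_to_regex(ucf)
    return re.fullmatch(pattern, candidate, flags=re.IGNORECASE) is not None
```

For instance, `ucf_matches("Smi[lt]h", "Smith")` and `ucf_matches("Jo_{0,1}n", "Jon")` are both true under these assumptions, while `ucf_matches("Sm_th", "Smth")` is false because _ always stands for exactly one character.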
Reading a Register
Your first reaction on looking at a register, especially the older ones, may well be to ask yourself, "How am I ever going to make sense of this?" Your second reaction may be to ask yourself, "Why am I doing this?" Your third reaction might be to throw up your hands and walk away. Please don't. You are engaged in one of the most important activities designed to help all of us research our forebears. So please bear with it. In the following sections we will try to help you make sense of what you see. After a while you will come to recognise that old writing and surprise yourself at how good you have become. Also, don't forget you can use the Uncertain Character Format to deal with the problem entries and move on.
The Alphabet and its Graphical Representation
One of the biggest issues is how to read 16th century writing. I am no expert, and there are several resources on the internet that you may want to have a look at.
National Archives Tutorial
Scottish Handwriting
Geneology Handwriting
Old Handwriting
In the following paragraphs I will highlight some of the issues as I see them. After that you may want to go and have a look at some of those other resources. The following table gives an excellent rendition of the early alphabet and how people of different backgrounds wrote their text.
The first important thing to notice is that there were no separate characters for u and v. From the 1630s onwards, printers started to use the u letter-form (or 'graph') to denote the vowel, and the v graph to denote the consonant. Before this time there was only one recognised letter of the alphabet, which could be written or printed in two ways. This is why the letter w is called 'double-u' and not 'double-v'. Printers before the 1630s used v initially (at the start of a word) and u medially. Practice in manuscript was never this consistent, with u and v graphs being used for both consonant and vowel, both initially and medially. Ambiguities caused by this system can make life difficult. It is important that you don't lose information by deciding too soon whether a u or v graph encountered is the vowel or the consonant. Your job in transcribing is to report exactly what is there in the register, so u and v forms must be distinguished from each other where possible and not silently or unconsciously brought into line with modern practice.
A different case is the letters i and j. As late as the nineteenth century, some still insisted that j was just a variant form of the letter i, which could represent both a vowel and a consonant. But many tried to use the j form for a consonant and the i for a vowel. Again, your job is to record what you see, which will in most cases be a letter i.
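Recording the u, v, i and j forms exactly as written need not make entries harder to find. As a hedged illustration only, the Python sketch below shows how a search could treat each pair as interchangeable while the transcription itself keeps the original spelling. The names `INTERCHANGEABLE`, `variant_pattern` and `find_variants` are hypothetical, and nothing here describes how the FreeREG search engine actually behaves.

```python
import re

# Assumed equivalence groups, taken from the letter pairs discussed above.
INTERCHANGEABLE = [("u", "v"), ("i", "j")]


def variant_pattern(term):
    """Build a case-insensitive regex that accepts either graph of each pair."""
    lookup = {}
    for group in INTERCHANGEABLE:
        char_class = "[" + "".join(group) + "]"
        for letter in group:
            lookup[letter] = char_class
    # Letters in a pair become a character class; everything else is literal.
    parts = [lookup.get(ch.lower(), re.escape(ch)) for ch in term]
    return re.compile("".join(parts), re.IGNORECASE)


def find_variants(term, transcribed):
    """Return the transcribed names that match the term, spelling preserved."""
    pattern = variant_pattern(term)
    return [name for name in transcribed if pattern.fullmatch(name)]
```

Under these assumptions, a search for "Oliver" would find a register entry transcribed faithfully as "Oliuer", and a search for "John" would find "Iohn", without the transcriber having modernised either.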
The 's' is especially problematic. It has both long 's' and short 's' forms. The long 's' is usually clear at the start of a word, e.g. [example image]. But don't get the long 's' and 'f' mixed up in a word, e.g. [example image] and [example image]. Normally the 'f' will have a cross stroke, even if it's hardly noticeable, and the context will make it clear whether it is a long 's' or an 'f'.
The terminal 's' tends to fall between the two forms. See for example, [example image], [example image] and [example image]. Also look at the capital 'H' in the last case.
Within a word the double 's' is written with a long 's' followed by a short 's', looking like an 'fs'.
Remember that in secretary hand the lowercase 'c' looks exactly like a modern day 'r'.
The lower case 'e' tends to not have a central stroke, so can look more like a 'c', or an 'o' if it is biting with the next letter. See for example, [example image].
Also note the use of double 'f' which is a capital 'F'. See for example, [example image] and [example image]. It would be easy to mistake these as a modern 'H'.
There are two forms of lower case 'r', the '2' shaped one which occurs after 'o', and the long 'r' which descends below the line. The long 'r' can consist of no more than a single downstroke, with no horizontal stroke at all. This can make it quite hard to distinguish, particularly when combined with a preceding 'e'.
Note the use of 'es' for the genitive, rather than an apostrophe and 's'. For example, kinges. It may look like there is an apostrophe after the 'e', but what you can see is actually part of the letter 'e', called a 'horn'. See for example, [example image]. Note also the nature of the capital 'R' in this example.
An abbreviation sign indicates that characters have been omitted. It is a dash over the preceding vowel(s). The context will make it clear which letter(s) are missing. See for example, [example image] and [example image]. Note also the nature of the capital 'R' in both these examples. In these cases the missing letters are inserted in the transcription. (This is a deviation from normal practice caused by our use of the Uncertain Character Format.)
Hints and Tips.
Last update
12 Dec 2006 DKD