A couple of years ago, I introduced the horrible acronym BGRN to represent a new notation for DNA relationships which I extended to also include non-genetic relationships. Using this notation, one can define precisely how one person is related to a second person. Using just the notation, I can programmatically determine the expected amount of DNA shared between the two people (autosomal, Y, X and mt), and can express in English how the second person is related to the first.
e.g. YXY(YX)xy = male person’s mother’s father’s sister’s son.
Back then I decided to make it a universal (non-English-centric) notation using the DNA X for a woman and Y for a man, using an uppercase letter for going up to a parent and a lowercase letter for going down to a child.
I was working to implement it into Behold last year, when I got diverted into Double Match Triangulator (DMT) development. Currently, I am trying to finish off DMT version 3.0 and I have found a need for the notation in DMT.
But as I was doing so, I realized something. If a computer is going to handle the notation, then a set of X’s and Y’s and x’s and y’s works fine. It’s not quite as good when people need to be able to read, enter and understand these values. In DMT, I’m going to allow people to enter the relationships of any of their matches that they know. People are not going to want to enter YXY(YX)xy. It is not simple enough and not understandable enough.
So here is my new version of Behold’s Genetic Relationship Notation (BGRN). It is an English-based version (sorry non-English speakers) that uses the initial letters of recognizable English words to designate the genetic connection.
For example, our YXY(YX)xy will in this new notation be MFRDS, which translates to “the person’s mother’s father’s parents’ (both of them) daughter’s son”. All the letters are uppercase. The “R” represents the paiR of paRents of both the F(ather) and the D(aughter) and indicates that the F and D are full siblings sharing both parents. Using a single letter R rather than grouping F and M together eliminates the need for parentheses such as those in (YX).
Let’s now define all the rules, as I did in the earlier XY version of the notation:
The Behold Genetic Relationship Notation (BGRN) Revised
Behold’s Genetic Relationship Notation defines a string of characters that represents how person A connects to person B. With this string and the sex of person A, you should be able to:
a) Determine the expected amount of DNA shared by the two people, and
b) Describe the relationship in words.
The basic genetic notation uses the following characters to make up the string:
- F = father
- M = mother
- P = parent of unknown sex
- R = pair of parents, to represent a pair of Common Ancestors (CA)
- S = son
- D = daughter
- C = child of unknown sex
- T = identical twin, e.g. FT means the “T” is the identical twin of the “F”.
- B = boy, optional, only used in position 1 if the starting person is male.
- G = girl, optional, only used in position 1 if the starting person is female.
- ? = rest of the connection is not known
That’s it. 10 uppercase letters and a question mark in this revised notation, compared to the 4 uppercase and 4 lowercase letters, a number, a hyphen and parentheses of the original notation.
The sex of the starting person is optional. If included, it will be the first character of the string. Some genetic analyses may need it to determine whether the Y or X chromosome can be shared between the starting and ending people.
The core rules of the revised notation, for purely genetic relationships, are:
- The string optionally starts with B or G.
- This is followed by 0 or more of: F, M, P.
- This may be followed by one R or by one T.
- This is followed by 0 or more S, D, C.
- It may end in a ?.
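The five rules above map directly onto a regular expression. The following is only a minimal sketch of a checker, not the code used in Behold or DMT; the names `BGRN_GENETIC` and `is_valid_genetic` are made up for illustration, and rejecting the empty string is my own assumption since the rules do not say what an empty string means.

```python
import re

# One regex piece per rule, in the order listed above:
#   [BG]?    optional sex of the starting person
#   [FMP]*   zero or more steps up (father, mother, parent)
#   [RT]?    at most one pair of parents OR one identical twin
#   [SDC]*   zero or more steps down (son, daughter, child)
#   \??      optional "?" for "rest of the connection is not known"
BGRN_GENETIC = re.compile(r"[BG]?[FMP]*[RT]?[SDC]*\??")


def is_valid_genetic(code: str) -> bool:
    """Return True if `code` follows the core (purely genetic) rules.

    Assumption: an empty string is rejected.
    """
    return bool(code) and BGRN_GENETIC.fullmatch(code) is not None


if __name__ == "__main__":
    for code in ["BMFRDS", "GFTD", "PR?", "SF", "MRTD"]:
        print(code, is_valid_genetic(code))
```

Strings such as SF or MRTD fail here because the core rules only allow steps up, then at most one R or T, then steps down.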
Below are some examples of the notation for genetic relationships and the full relationship in words (plus a simplified relationship in parenthesis) that can be generated from it:
BMF = a boy’s mother’s father (or maternal grandfather)
MFR = a person’s mother’s father’s parents (or great-grandparents)
BMFRDS = a boy’s mother’s father’s sister’s son (or 1C1R)
FDDDD = a person’s paternal half-sister’s daughter’s daughter’s daughter (or half-great-grand-niece)
GSS = a girl’s son’s son (or grandson)
GFTD = a girl’s father’s identical twin’s daughter (or 1st cousin).
See how much easier these are to read and interpret than their representation in my original version of this notation, which was: YXY, UXF(YX), YXY(YX)xy, U(Y)xxx, Xyy and XY2x.
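To show how the wording can be generated mechanically, here is a small sketch that translates a genetic string letter by letter. It is an illustration under my own assumptions, not Behold's implementation: the function name `describe` and the word table are hypothetical, and it produces only the literal form (for example "parents' daughter" rather than "sister"), not the simplified relationship in parentheses.

```python
# Literal word for each letter of the core notation.
WORDS = {
    "F": "father", "M": "mother", "P": "parent", "R": "parents",
    "S": "son", "D": "daughter", "C": "child", "T": "identical twin",
}
STARTS = {"B": "a boy", "G": "a girl"}


def describe(code: str) -> str:
    """Translate a genetic BGRN string into a literal English phrase."""
    # Optional leading B or G gives the sex of the starting person.
    if code[:1] in STARTS:
        start, rest = STARTS[code[0]], code[1:]
    else:
        start, rest = "a person", code

    # A trailing "?" means the rest of the connection is not known.
    unknown = rest.endswith("?")
    if unknown:
        rest = rest[:-1]

    parts = [start] + [WORDS[ch] for ch in rest]

    # Every word except the last becomes a possessive; a plural such as
    # "parents" takes the apostrophe after the s.
    phrase = [w + ("'" if w.endswith("s") else "'s") for w in parts[:-1]]
    phrase.append(parts[-1])
    text = " ".join(phrase)
    if unknown:
        text += " (rest of the connection not known)"
    return text


if __name__ == "__main__":
    print(describe("BMF"))     # a boy's mother's father
    print(describe("BMFRDS"))  # a boy's mother's father's parents' daughter's son
```

Collapsing "parents' daughter" into "sister", or a whole string into "1C1R", would need an extra lookup step that this sketch leaves out.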
Here are examples of some common relationships:
M = mother
MM = maternal grandmother
PPM = great-grandmother (unknown side)
PRD = aunt
PRCC = 1st cousin
PRCCC = 1st cousin, once removed (1C1R)
PPRCC = 1st cousin, once removed (the other way)
PRCCCC = 1st cousin, twice removed (1C2R)
PPPRCC = 1st cousin, twice removed (the other way)
PPRD = great-aunt
PPRCCC = 2nd cousin
PPRCCCC = 2nd cousin, once removed (2C1R)
PPPRCCC = 2nd cousin, once removed (the other way)
In the above examples, any of the P’s can be replaced by F’s or M’s, and any of the C’s can be replaced by S’s or D’s.
Hopefully, you’re getting the idea and this seems easier to read than trying to decipher a string of uppercase and lowercase X’s and Y’s.
I won’t go into the calculation of how much DNA is shared since it’s worthy of another post, but let me say that the expected values can be easily obtained from strings written in Behold Genetic Relationship Notation along with the sex of the starting person.
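As a rough illustration only, the autosomal part of that calculation can be sketched with the usual rule of thumb that each generational step halves the expected share. This is not necessarily how Behold or DMT computes it. The function name `expected_autosomal_share` is hypothetical, and the sketch assumes no pedigree collapse, ignores Y, X and mt DNA, and does not accept strings ending in "?".

```python
import re
from fractions import Fraction

# Core genetic grammar without the trailing "?": ups, optional R or T, downs.
CORE = re.compile(r"[BG]?([FMP]*)([RT]?)([SDC]*)")


def expected_autosomal_share(code: str) -> Fraction:
    """Expected fraction of autosomal DNA shared, by halving per step."""
    match = CORE.fullmatch(code)
    if match is None:
        raise ValueError(f"not a complete genetic BGRN string: {code!r}")
    ups, link, downs = match.groups()
    steps = len(ups) + len(downs)

    if link == "R":
        if not downs:
            # String ends at the pair itself: share with EACH of the two.
            return Fraction(1, 2) ** (steps + 1)
        # Two common ancestors, each one extra step up from the last F/M/P.
        return 2 * Fraction(1, 2) ** (steps + 1)

    # No R: a single line of descent (direct or "half" relationship).
    # T: an identical twin is genetically the same person, so adds no step.
    return Fraction(1, 2) ** steps


if __name__ == "__main__":
    for code in ["M", "RD", "PRCC", "FDDDD", "GFTD", "PPRCCC"]:
        print(code, float(expected_autosomal_share(code)))
```

With these assumptions PRCC (1st cousin) comes out at 1/8 and PPRCCC (2nd cousin) at 1/32, which matches the commonly quoted expected values of 12.5% and 3.125%.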
Extending the Notation to Non-Genetic Relations
I still would like to extend this notation to handle more than just Genetic relationships and include all possible genealogical relationships. So let’s define the additional notation:
- f = non-genetic but legal father
- m = non-genetic but legal mother
- p = non-genetic but legal parent of unknown sex
- r = non-genetic but legal pair of two parents
- s = non-genetic but legal son
- d = non-genetic but legal daughter
- c = non-genetic but legal child of unknown sex
- h = husband
- w = wife
- z = spouse of unknown sex
- n = unmarried partner of any sex
The nice thing about this is all these non-genetic relationships are lowercase. So that means that as soon as you see a lowercase letter in a relationship, then you know the genetic link is broken and there will be zero DNA shared for this connection.
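That check is simple to automate. The sketch below, with the hypothetical name `first_non_genetic_step`, just looks for the first lowercase letter. Note that it only tests the one direction stated above: a lowercase letter means no shared DNA, but an all-uppercase extended string does not by itself guarantee shared DNA (RDSF, the sister's son's father, is all uppercase yet normally unrelated).

```python
from typing import Optional


def first_non_genetic_step(code: str) -> Optional[int]:
    """Return the 0-based position of the first lowercase letter, or None.

    A lowercase letter marks a legal or marital link rather than a genetic
    one, so no DNA is expected through the connection from that point on.
    """
    for position, letter in enumerate(code):
        if letter.islower():
            return position
    return None


if __name__ == "__main__":
    print(first_non_genetic_step("MMMhDhMh"))  # 3
    print(first_non_genetic_step("PRCC"))      # None
```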
Examples of the extended notation and the relationship in words that can be generated from it:
n = Person’s partner.
RD = Person’s sister.
nRDcF = Person’s partner’s sister’s adopted child’s father.
MMMhDhMh = Person’s mother’s mother’s mother’s husband’s daughter’s husband’s mother’s husband.
RDSFSwRCz = Person’s sister’s son’s paternal half-brother’s wife’s sibling’s spouse.
So BGRN can handle any relationship, no matter how complicated.
And if you notice, I’ve been careful to include only consonants as the letters of the notation. Had any vowels been included, it would have been possible to create relationships that are real words in English, and that is risky as some not-so-desirable words could appear.
I am interested in hearing any and all comments, criticisms and suggestions.
—
Update: Sept 3, 2018: I made the change of a pair of parents from “B” or “b” to “R” or “r”. The “B” was taken from “both parents”, but that phrase does not read well when you string them together as in: “father’s parent’s both parents’ son”. So the “R” now is more indicative of “paRents” and the phrase FPRS can now be generated as: “father’s parent’s parents’ son”. Notice the subtlety of the apostrophe before or after the “s” in parents to indicate if there is more than one. If that is too subtle, it could be translated to “father’s parent’s pair of parents’ son”. Because of this change, I also had to change the spouse or common-law partner of unknown sex from “r” to “t”.
—
Update: Nov 20, 2018: I added the optional B and G at the start of the string to indicate the sex of the starting person if that is relevant to the relationship (e.g. for X or Y chromosome purposes).
—
Update: Jan 2, 2019: Changed spouse or partner of unknown sex from “t” to “n” so that “t” will not be confused with “T” which is for identical twin.
—
Update: July 5, 2022: Changed “n” to be unmarried partner of any sex and added “z” as spouse of unknown sex.
—
Update: July 14, 2022: At various times, I’ve seen people ask how to distinguish the two types of removals of cousins, e.g. 3C2R. I like 2U3C to go from the current person up 2 generations and across 3 cousins. And 3C2D to first go across 3 cousins and then down 2 generations.
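These directional labels can be read straight off a BGRN string by counting the steps up and the steps down. The sketch below is my own reading of the 2U3C and 3C2D idea; the function name `directional_cousin_label` is hypothetical, and it only handles full-cousin strings that pass through an R.

```python
import re

# Full-cousin strings: optional B/G, steps up, the pair R, steps down.
VIA_PAIR = re.compile(r"[BG]?([FMP]*)R([SDC]*)")


def directional_cousin_label(code: str) -> str:
    """Label a cousin string as e.g. '2C', '1U1C' (up first) or '1C1D' (down last)."""
    match = VIA_PAIR.fullmatch(code)
    if match is None:
        raise ValueError(f"not a relationship through a pair of parents: {code!r}")
    up = len(match.group(1)) + 1    # the R itself is one more generation up
    down = len(match.group(2))
    degree = min(up, down) - 1      # 1 = 1st cousin, 2 = 2nd cousin, ...
    if degree < 1:
        raise ValueError(f"not a cousin relationship: {code!r}")
    if up > down:
        return f"{up - down}U{degree}C"   # up some generations, then across
    if down > up:
        return f"{degree}C{down - up}D"   # across, then down some generations
    return f"{degree}C"


if __name__ == "__main__":
    print(directional_cousin_label("PRCCC"))       # 1C1D
    print(directional_cousin_label("PPRCC"))       # 1U1C
    print(directional_cousin_label("PPPPPRCCCC"))  # 2U3C
```

So the two kinds of 1C1R in the table above, PRCCC and PPRCC, come out as 1C1D and 1U1C respectively.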