Louis Kessler’s Behold Blog » Blog Entry

Sex in GEDCOM - Thu, 14 Jan 2016

I have come across a need to check out the SEX tag in GEDCOM. Some of the new DNA features I’m finishing up for the next version of Behold make important use of the sex of the individual. Determining autosomal, X, Y and mitochondrial DNA shares between two individuals is much less accurate when the sex of anyone in the relationship line is not known.

GEDCOM includes sex quite succinctly as a level 1 tag of an individual defined like this:

+1 SEX <SEX_VALUE>   {0:1}

where

SEX_VALUE :=    { Size=1:7 }
A code that indicates the sex of the individual:
      M = Male
      F = Female
      U = Undetermined from available records and quite sure that it can’t be

A few oddities already. It appears that only “M”, “F” and “U” are allowed for the SEX_VALUE, and I’ve never noticed a program that doesn’t adhere to this. But if you read carefully, the definition does not actually restrict the value to these three. It leaves the door open to other possibilities (what, I can’t guess). I find it very strange to see Size=1:7 if only one-character codes are allowed. Why not Size=1:1?

Also, the SEX tag may be missing entirely, since {0:1} allows zero or one occurrence.
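As a rough illustration of those two constraints, here is a minimal Python sketch that checks the level 1 SEX lines of a single INDI record against {0:1} and Size=1:7. The function name check_sex_lines and the idea of passing a record as a list of text lines are assumptions made for this example; this is not how Behold or any particular program does it.

```python
def check_sex_lines(record_lines):
    """Return a list of warnings about the level 1 SEX lines of one INDI record.

    record_lines is assumed to hold the raw GEDCOM lines of a single record.
    """
    values = []
    for line in record_lines:
        parts = line.split(maxsplit=2)
        if len(parts) >= 2 and parts[0] == "1" and parts[1] == "SEX":
            # A SEX line with no value is recorded as an empty string.
            values.append(parts[2].strip() if len(parts) == 3 else "")

    warnings = []
    if len(values) > 1:
        warnings.append("more than one SEX tag ({0:1} allows at most one)")
    for value in values:
        if not 1 <= len(value) <= 7:
            warnings.append(f"SEX value {value!r} is outside Size=1:7")
        elif value not in ("M", "F", "U"):
            warnings.append(f"SEX value {value!r} is not M, F or U")
    return warnings
```

A record with no SEX line at all produces no warnings, which matches the {0:1} cardinality.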

My interest from the DNA perspective is in determining, where possible, whether the individual is male or female.

So let’s use rule number 1:

1. If the SEX_VALUE is “M”, the individual is assumed to be male.
    If the SEX_VALUE is “F”, the individual is assumed to be female.
    If the SEX_VALUE is anything else, or missing, then the sex is unknown.
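Rule 1 is small enough to write down directly. The following Python sketch is only an illustration: the function name sex_from_tag and the string results "male", "female" and "unknown" are choices made for this example, and a missing SEX tag is represented by None.

```python
from typing import Optional


def sex_from_tag(sex_value: Optional[str]) -> str:
    """Rule 1: map a GEDCOM SEX_VALUE to 'male', 'female' or 'unknown'."""
    if sex_value == "M":
        return "male"
    if sex_value == "F":
        return "female"
    # "U", any other value, or a missing SEX tag (None) all mean unknown.
    return "unknown"
```

The comparison is deliberately strict, so a lowercase "m" or a full word such as "MALE" is treated as unknown. Whether to be more lenient is a judgment call for the reading program.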

If that was all of it, we’d be done. But there’s more.

Children have parents. Genetically, they always have a father and a mother, although that isn’t necessarily so for adoptive parents, foster parents, etc. Again, I’m going to restrict myself to DNA interest and assume that there is one male and one female parent, whether or not the parents are known or unknown.

In a GEDCOM file, each individual points with a FAMC tag to the FAM record that contains the person’s parents. An individual could have more than one FAMC tag and point to multiple FAM records. Only one of those FAM records can contain the birth parents. All the other FAM records must each contain at least one non-birth parent.

If a person has multiple sets of parents, then it is important to know which parents are the birth parents. GEDCOM does not give any specific rules for ordering FAMC tags. It does give a rule for ordering CHIL (child) tags and states: “The preferred order of the CHILdren pointers within a FAMily structure is chronological by birth”. You would think, then, that a logical extension would be that FAMC tags should also be ordered chronologically, with the birth parents always listed first. Behold already checks the “MARR” date and reorders the FAMCs when the dates are out of order. I don’t believe very many programs enforce FAMC order in their GEDCOM output, as I’ve seen incorrectly ordered FAMCs in a good number of the test files I use.

The FAMC tag could have a level 2 PEDI tag under it which contains a PEDIGREE_LINKAGE_TYPE value, which is one of: “adopted”, “birth”, “foster” or “sealing”. If this tag is present and “birth” is specified, then that FAMC tag should be listed first. Now we have more complications. We have to ensure that at most one FAMC tag for an individual has a PEDI tag with a “birth” value. In practice, I have not seen the PEDI tag used very often.

GEDCOM also allows (just to make a genealogy programmer’s job more difficult) a FAMC tag to be subordinate to an individual’s BIRT (birth), CHR (christening), or ADOP (adoption) tag. Here, if a FAMC tag is subordinate to a BIRT tag, then that family should be the first FAMC. I have seen this used occasionally.
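To make the selection of the birth family concrete, here is one possible Python sketch. It is not Behold’s code. The names FamcLink and pick_birth_family are hypothetical, and the priority used (a PEDI “birth” value first, then a FAMC under BIRT, then the first FAMC in file order that has no PEDI tag) is an assumption for the example; GEDCOM itself does not say which clue should win. The MARR date reordering mentioned earlier is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FamcLink:
    family_id: str               # e.g. "@F1@"
    pedi: Optional[str] = None   # value of a level 2 PEDI tag, if present


def pick_birth_family(
    links: List[FamcLink], birt_famc: Optional[str] = None
) -> Tuple[Optional[str], List[str]]:
    """Return (family id of the presumed birth parents, warnings).

    birt_famc is the family id of a FAMC tag found under BIRT, if any.
    """
    warnings: List[str] = []
    birth_links = [l for l in links if (l.pedi or "").lower() == "birth"]
    if len(birth_links) > 1:
        warnings.append("more than one FAMC tag has a PEDI value of birth")
    if birth_links:
        return birth_links[0].family_id, warnings
    if birt_famc is not None:
        return birt_famc, warnings
    # Assumed fallback: the first FAMC in file order that is not
    # explicitly marked as adopted, foster or sealing.
    for link in links:
        if link.pedi is None:
            return link.family_id, warnings
    return None, warnings
```

For example, an individual whose first FAMC is marked “adopted” and whose second is marked “birth” gets the second family back, which is the family that should have been listed first.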

Okay. Now we’ve established, to the best of our ability, the FAM record of the birth parents. Next we have to determine who the parents are.

I was going to describe the FAM record and HUSB and WIFE tags in much more detail, but I don’t have to because I’ll just point you to an excellent article that Tamura Jones just happened to publish earlier today: Marriage in GEDCOM

Tamura correctly states that the FAM record need not contain the HUSB or the WIFE tags. If not, well, then we just don’t know who that parent is.

My interest for my DNA purpose, however, is to determine each parent’s sex. The HUSB and WIFE tags point to the parents’ INDI records, and each INDI record may have a SEX tag to which we can apply rule 1 (above).

But what if rule 1 results in “unknown”? Should we then be able to infer the parent’s sex from which one was associated with the HUSB tag and which one with the WIFE tag? I’m not 100% sure yet. I believe so, but I don’t know whether many programs enforce this association when exporting to GEDCOM. My next step will be to add a check into Behold that will see if the HUSB tag is pointing to a female individual, or if the WIFE tag is pointing to a male individual.

I would think the SEX tag of the individual (rule 1) normally should overrule the HUSB/WIFE tag pointing to the individual. So I would add rule 2:

2. if the sex is unknown from rule 1, then
    if only HUSB pointers point to this individual, assume he is male.
    if only WIFE pointers point to this individual, assume she is female.
    if both HUSB and WIFE pointers point to this individual, issue an error.
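Rules 1 and 2 together can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name infer_sex is made up, the caller is assumed to have already counted how many HUSB and WIFE pointers across all FAM records refer to the individual, and the “issue an error” step is modelled as raising a ValueError.

```python
from typing import Optional


def infer_sex(sex_value: Optional[str], husb_count: int, wife_count: int) -> str:
    """Apply rule 1, then rule 2. Returns 'male', 'female' or 'unknown'."""
    # Rule 1: the SEX tag, when it is M or F, overrules everything else.
    if sex_value == "M":
        return "male"
    if sex_value == "F":
        return "female"
    # Rule 2: fall back to the HUSB/WIFE pointers that reference this person.
    if husb_count > 0 and wife_count > 0:
        raise ValueError("both HUSB and WIFE pointers point to this individual")
    if husb_count > 0:
        return "male"
    if wife_count > 0:
        return "female"
    return "unknown"
```

Note that a person with SEX M who appears as a WIFE in some family still comes out as male, because rule 2 is never reached once rule 1 gives an answer.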

But if these are the birth parents, they cannot be the same sex. If these two rules result in both birth parents being assigned the same sex, then Behold will provide a message pointing out the conflict and indicate that for this case, it will assume the HUSB/WIFE tags to be correct.
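The conflict handling for birth parents could look something like the Python sketch below. The function name resolve_birth_parents and the wording of the message are invented for the example, and the inputs are assumed to be the results of rules 1 and 2 for the individuals behind the HUSB and WIFE pointers of the birth family.

```python
from typing import Optional, Tuple


def resolve_birth_parents(
    husb_sex: str, wife_sex: str
) -> Tuple[str, str, Optional[str]]:
    """Return (sex of HUSB parent, sex of WIFE parent, message or None).

    Each input is 'male', 'female' or 'unknown'.
    """
    if husb_sex != "unknown" and husb_sex == wife_sex:
        # Birth parents cannot be the same sex, so trust the HUSB/WIFE tags.
        message = (
            f"both birth parents are recorded as {husb_sex}; "
            "assuming the HUSB is male and the WIFE is female"
        )
        return "male", "female", message
    return husb_sex, wife_sex, None
```

Two unknowns are not treated as a conflict here; they simply stay unknown.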

If the individual’s SEX is specified, then rule 2 is not needed and the HUSB and WIFE pointers do not have to be looked at. But what if the HUSB or WIFE tag conflicts with the SEX tag? This is possible in the case of same-sex marriages, and assigning both individuals the same sex is likely a reasonable way of adding same-sex marriages to a GEDCOM standard that many have said does not allow them. Two individuals can both be males or both be females. But two HUSB tags or two WIFE tags are not allowed. Therefore, for a same-sex marriage the HUSB tag would point to one individual, and the WIFE tag would point to the other.

GEDCOM states:  “The family record structure assumes that the HUSB/father is male and WIFE/mother is female.” Note that it says “assumes”, and does not state “requires”.

So in GEDCOM, a same-sex couple could be represented as:

The family record is no different than normal:

0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@

The INDI records for two males:

0 @I1@ INDI
1 SEX M
1 FAMS @F1@

0 @I2@ INDI
1 SEX M
1 FAMS @F1@

or for two females:

0 @I1@ INDI
1 SEX F
1 FAMS @F1@

0 @I2@ INDI
1 SEX F
1 FAMS @F1@
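A reading program can spot this situation by comparing each HUSB and WIFE pointer with the SEX tag of the individual it points to. The Python sketch below does that for a small GEDCOM text. It is a minimal illustration and not a real GEDCOM parser: the function name find_role_mismatches is made up, and it only looks at level 0 record lines and at level 1 SEX, HUSB and WIFE lines.

```python
def find_role_mismatches(gedcom_text):
    """Return (family id, tag, individual id) for every HUSB pointing to an
    individual with SEX F and every WIFE pointing to one with SEX M."""
    sex = {}       # individual id -> SEX value
    spouses = []   # (family id, "HUSB" or "WIFE", individual id)
    current_id, current_type = None, None
    for raw in gedcom_text.splitlines():
        parts = raw.split()
        if not parts:
            continue
        if parts[0] == "0":
            # Record lines look like: 0 @I1@ INDI
            if len(parts) >= 3 and parts[1].startswith("@"):
                current_id, current_type = parts[1], parts[2]
            else:
                current_id, current_type = None, None
        elif parts[0] == "1" and len(parts) >= 3:
            if current_type == "INDI" and parts[1] == "SEX":
                sex[current_id] = parts[2]
            elif current_type == "FAM" and parts[1] in ("HUSB", "WIFE"):
                spouses.append((current_id, parts[1], parts[2]))
    # Records can come in any order, so compare only after reading everything.
    opposite = {"HUSB": "F", "WIFE": "M"}
    return [(fam, tag, indi) for fam, tag, indi in spouses
            if sex.get(indi) == opposite[tag]]


# The two-male example from above.
sample = """0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
0 @I1@ INDI
1 SEX M
1 FAMS @F1@
0 @I2@ INDI
1 SEX M
1 FAMS @F1@
"""
```

For the two-male sample, find_role_mismatches(sample) reports the WIFE pointer to @I2@ in family @F1@. Whether that should be shown as an error or read as a same-sex couple is up to the program.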

For more information about same-sex couples in GEDCOM, read Tamura Jones’ article: Same-Sex Marriage in GEDCOM

With regard to GEDCOM, I daresay that SEX is neither clean nor easy.

2 Comments

1. tamura (tamura)
Posted: Fri, 15 Jan 2016

You ask, why {size 1:7} if the allowed values are the one-letter values M, F, U?
Well, the full words for the possible SEX values are MALE, FEMALE and UNKNOWN, and that last word is 7 letters long.
The rest of the explanation is either the usual sloppy FamilySearch editing, or one of FamilySearch’s systems actually used the full words…

2. Louis Kessler (lkessler)
Posted: Fri, 15 Jan 2016

Very likely! I missed that because GEDCOM 5.5.1 does not refer to “U” as “UNKNOWN” but as “Undetermined from available records and quite sure that it can’t be”.

GEDCOM 5.5 only allowed M=Male and F=Female and had size 7. It did not allow U or Unknown.

But GEDCOM 5.4 allowed M=Male, F=Female and U=Unknown.

But GEDCOM 5.3 only allowed M=Male and F=Female and had size 7. It did not allow U or Unknown.

What a flip-flop between versions!
