Inferred Segment Matches - Thu, 7 Mar 2019
When we match our DNA to other people to find common ancestors, we are comparing segments of DNA that match the other people. That’s only logical, isn’t it?
Well, interestingly enough, there’s a technique that will help you determine which ancestors your DNA comes from by using non-matches. Actually, you are using matches of people you are closely related to, and finding common relatives who they match to, but you don’t.
Jonny Perl, the author of DNA Painter, recently wrote an article about this technique titled Painting your DNA with inferred matches. I believe he is the person who named it “Inferred Matching”. (Please correct me Jonny if this is not the case.)
Jonny gave examples showing how he used:
- His dad’s matches with a 2nd cousin once removed that he did not match
- His dad’s half-brother’s matches that his father matches but he does not match
- His mother’s paternal cousin, and his second cousin
- Siblings
The basic idea behind Inferred Matching is that you know you got each segment of your DNA from either your father or your mother. And each (small enough) segment you got from a parent came from either that parent’s father or that parent’s mother. What you do is find another close relative, who I’ll call Person B, who matches a third person who I’ll call Person C. If Person B matches Person C on a segment, but you (Person A) do not match Person C on that segment, then you couldn’t have got your segment from the same line. If you had, it would have matched.
So Inferred Matching basically tells you the ancestral line your segment did not come from.
Looking at the diagram above, I show an example where I’m assuming your grandmother’s father (GM’s father) is the ancestral source of a segment. He passes it down through your grandmother, through your uncle/aunt to your 1st cousin (Person B). He also passes it down to your more distant cousin (Person C). If he passed the same segment down to you as well, then you and your two cousins would all have the same segment, your segments would all match each other and you therefore triangulate. The triangulation is a clue that all three of you may have been passed down that segment from a common ancestor.
But what if your two cousins match each other, but you don’t match? You know you couldn’t have got the segment from your GM’s father. So who could have given you the segment? Answer: the segment you got from your parent could have instead been passed down from your GF’s father, your GF’s mother or your GM’s mother.
So you usually can’t directly tell which line your segment came from with Inferred Matching. In the above example, you still don’t even know if the line is from your grandfather or grandmother’s side. But it does tell you the one line that your segment didn’t come from.
Alone, you can’t do too much with it. But combined with other information, you can. If you find another cousin, who matches someone else on your grandmother’s side that you also match, but not on that segment, then you have a second refutation. If that refutation is, say, on your grandmother’s mother’s side, then all of a sudden you have refuted both your grandmother’s parents, and your segment should be on your grandfather’s side. Then if through yet another pair of cousins, you infer that the segment cannot be on your grandfather’s father’s side, all that remains is your grandfather’s mother’s side, and that could very likely be the ancestral path for your segment of interest.
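The elimination described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not code from any tool. The line labels and the function name `remaining_lines` are made up for the example.

```python
# Hypothetical labels for the four great-grandparent lines behind one parent.
ALL_LINES = {"GF-father", "GF-mother", "GM-father", "GM-mother"}

def remaining_lines(refuted):
    """Return the lines still possible after removing every refuted line."""
    return ALL_LINES - set(refuted)

# One inferred match: cousins B and C match through GM's father, you do not.
one = remaining_lines(["GM-father"])

# Three inferred matches on the same segment leave a single candidate line.
three = remaining_lines(["GM-father", "GM-mother", "GF-father"])

print(sorted(one))    # three lines are still possible
print(sorted(three))  # only one line is left
```

A single refutation still leaves three lines, which is why one inferred match on its own rarely settles anything.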
Who Can Be Used for Inferred Matching?
Persons B and C can be anyone who is related to you and shares a Most Recent Common Ancestor (MRCA) with you. You must match Person B and Person C somewhere, but it’s the segments where you don’t match one or both of them that can be used for inferred matching. The ancestral path through the MRCA that is closest to you is the one that you can refute, because you cannot continue to follow up that path to the further MRCA. If you did, then you would be matching on that segment.
Using a parent or a parent’s descendant as a Person B is wonderful. With a parent, sibling, nephew or niece, you are now dealing with only two possible segments that you can receive rather than four. Because of that, Inferred Matching on segments that your parent or half-relative matches but you don’t will always tell you that if your match is not through your parent’s father, then it must be through your parent’s mother (and vice versa). You will need to know the MRCA of Person C so you can determine which grandparent the non-match will be on. Jonny’s article gives excellent examples of this.
Caveat
Of course nothing’s ever perfect. If your Person B or Person C is related to you more than one way, e.g. through both of your grandparents, then you could get incorrect results. But this should be a somewhat rarer case. Normally, Inferred Matching works and works pretty well.
Visual Phasing
Inferred Matching has been used before Jonny’s paper. The technique of Visual Phasing takes the matches of 3 or more siblings and compares them. In doing so, the segments of each sibling’s DNA that came from each grandparent can be determined. Visual Phasing has been around for a few years. Part of the technique involves refuting a grandparent on a segment, which is effectively Inferred Matching, but I’ve never seen any posts about Visual Phasing referring to the term “Inferred Matching”.
Inferred Matching and Double Match Triangulation
Doing Inferred Matching manually is laborious. For any segment, you need to find all the segment matches that your known relatives have with each other that you don’t match to. Then you must logically work out what ancestral paths back to the MRCAs are possible and see if you can eliminate some paths from possibility and thus infer the paths that are possible.
Inferred Matching works well with the ideas behind double match triangulation.
Double matching involves finding all the segment matches of Person A with Person C and comparing them to all the segment matches of Person B with Person C. Those that overlap (along with A’s segment matching B’s) are triangulations.
Inferred Matching uses the complementary information available in the data used for double matching. Inferred Matching uses the segment matches of Person B with Person C where Person A is not matching Person B and/or Person C on that segment.
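To make the "complementary information" idea concrete, here is a minimal Python sketch that splits one B-C segment into the parts where A also matches C and the parts where A does not. The function name `split_bc_match` and the (start, end) tuples in Mbp are assumptions made for this illustration; this is not how Double Match Triangulator stores or processes its data. The shared parts are only triangulation candidates, since A must also match B there.

```python
def split_bc_match(bc, ac_segments):
    """Split one B-C segment into parts A shares with C and parts A lacks.

    bc is a (start, end) pair; ac_segments is a list of (start, end) pairs
    for A's matches with C on the same chromosome.
    """
    start, end = bc
    shared, missing = [], []
    cursor = start  # everything before the cursor has been classified
    for a_start, a_end in sorted(ac_segments):
        lo, hi = max(a_start, cursor), min(a_end, end)
        if lo >= hi:
            continue  # no overlap with the unclassified part of the B-C segment
        if lo > cursor:
            missing.append((cursor, lo))  # B matches C here, A does not
        shared.append((lo, hi))           # both A and B match C here
        cursor = hi
    if cursor < end:
        missing.append((cursor, end))
    return shared, missing

shared, missing = split_bc_match((10, 40), [(15, 25)])
print(shared)   # candidate triangulation region
print(missing)  # regions usable for Inferred Matching
```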
I’ve been working on implementing Chromosome Mapping into what will be Version 3.0 of Double Match Triangulator. I’m also incorporating Inferred Matching into that. In Double Match Triangulator, an inferred match will be telling you what ancestral paths cannot occur, and will look like this:
The green sections are triangulations that Person A and Person B have with several C Persons. In the example triangulation group, the MRCAs of the C People who triangulate are not known. The ancestral path (MM = mother’s mother) is only known from Person B’s MRCA.
An inferred match is shown on the first line and states that Person A doesn’t have the B-C match and the ancestral path cannot be MMFF. So only MMFM, MMMF and MMMM are possible. If additional Inferred Matches are found for that segment that rule out more of the possible paths, then Double Match Triangulator may be able to extend the ancestral path of the triangulations to a longer path when it becomes the only possibility. This can provide extra information that wouldn’t have been available without the Inferred Matching.
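The path arithmetic in this example can be sketched as follows. This is a hedged illustration only. The function `extend_path` is hypothetical, and treating "extending the path" as taking the longest prefix shared by all remaining candidates is my assumption, not a description of how Double Match Triangulator does it.

```python
from itertools import product
from os.path import commonprefix  # works on any list of strings, not just file paths

def extend_path(known, refuted, depth):
    """Return (remaining paths, longest prefix shared by all of them)."""
    refuted = set(refuted)
    extra = depth - len(known)
    # Every way of continuing the known path with F (father) or M (mother).
    candidates = [known + "".join(p) for p in product("FM", repeat=extra)]
    remaining = [c for c in candidates if c not in refuted]
    return remaining, commonprefix(remaining)

# One inferred match rules out MMFF: three paths remain, still only "MM" is certain.
print(extend_path("MM", ["MMFF"], depth=4))

# A second inferred match rules out MMFM: the path can be extended to "MMM".
print(extend_path("MM", ["MMFF", "MMFM"], depth=4))
```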
Bonus: Inferred Matching on Triangulating Segments
Look at the 3rd line in the above diagram. This is a triangulation, but to the right there are 5 grey B’s. That is a section of the double match that Person A no longer matches. Person A stops matching at the last green T. But Person B continues matching Person C for 5 more Mbp (megabase pairs).
Inferred Matching can be applied to those 5 B’s. Person C has an ancestral path of “MM”, meaning that this segment can no longer be from the MM ancestral path. What we have found is a crossover at the end of that triangulation group belonging to Person A. These additional Inferred Matches are also being identified and will be displayed and used for ancestral path determination in the upcoming version 3.0 of Double Match Triangulator.
Of course we have to be careful not to use too small segments. There can always be some random matching at the beginning and end of any match, so we must make sure that the B-C matching preceding or following a triangulation is significant.
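The check for a significant overhang can be sketched like this. The 5 Mbp cutoff is an arbitrary value picked for the illustration, not a recommendation and not a setting taken from Double Match Triangulator, and the function name `overhang_inferences` is hypothetical.

```python
MIN_OVERHANG_MBP = 5.0  # assumed cutoff for illustration only

def overhang_inferences(triangulation, bc_match, min_mbp=MIN_OVERHANG_MBP):
    """Return the B-C stretches before and after a triangulation
    that are long enough to be used as inferred matches."""
    t_start, t_end = triangulation
    b_start, b_end = bc_match
    candidates = [(b_start, t_start), (t_end, b_end)]
    return [(s, e) for s, e in candidates if e - s >= min_mbp]

# A's triangulation covers 20-35 Mbp, but B keeps matching C from 19 to 40 Mbp.
# The 1 Mbp stretch before is ignored; the 5 Mbp stretch after is kept.
print(overhang_inferences((20, 35), (19, 40)))
```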
Double Match Triangulator 3.0
I’ve been making good progress and I will release DMT 3.0 as soon as it is ready. There have been so many great advances in DNA analysis over the past six months with clustering and new tools and especially new features at Ancestry DNA and MyHeritage DNA announced at RootsTech that I’ve been following. All of these have redirected my thinking as to what’s needed. I’ve established that the tool that is now needed is one that will help people do Chromosome Mapping by applying and automating the rules for them so they don’t have to do it themselves. The results will then be made available to you so that you can input them into DNA Painter and other tools.
I’m very excited as to what I have programmed so far. Most of what I’ve talked about above is completed in my development version. This post was mainly to document some of my thoughts about Inferred Matching, but is also meant to be a teaser as to what’s coming in DMT 3.0.
Stay tuned.
A Second Type of Inferral
It’s amazing as you work through the details of something and try to implement it programmatically that you suddenly realize something. I shake my head sometimes as to how the mind works, but it somehow connects all the dots together all by itself and suddenly this idea pops into your head.
The type of inferral that Jonny Perl wrote about and that I was writing about up to now is an inferral you can make because a close relative matches to someone on a segment, but you don’t.
What about the other way around? It works too. You can infer in a similar manner from a segment match that you have, but a close relative doesn’t.
The simple case of this is when you match someone on a segment, but one of your parents doesn’t. I like to call this “Parental Filtering”. Almost all the time, that will mean that either you match through your other parent, or the segment is false.
There is the borderline case where your parent falls under the match limit but you don’t. But in that case, you’ll still want to eliminate that segment from your analysis because you can’t say for sure that it is a segment going through that parent.
People do this parental filtering all the time, especially when they only have one parent tested. But you can also use siblings (as in Visual Phasing) to infer grandparent lines that you can’t have. And similarly you can use other segments that you have that some close relatives on those lines don’t have to infer more lines that you can’t have. And once you have all lines covered (e.g. both parents or all four grandparents), then you can start to classify segments as likely to be false.
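Parental filtering at the parent level can be sketched as below. The function name `possible_sides` and the use of plain sets of match names are assumptions made for the example. An empty result means both parents were tested and neither has the match, so the segment is a candidate for being false.

```python
def possible_sides(match, father=None, mother=None):
    """Return the parental sides a segment match can still come through.

    father / mother are the sets of people that parent matches on this
    segment, or None if that parent has not been tested.
    """
    sides = []
    for side, tested in (("paternal", father), ("maternal", mother)):
        # An untested parent cannot be ruled out.
        if tested is None or match in tested:
            sides.append(side)
    return sides

# Only the father is tested and he lacks the match: maternal, or a false segment.
print(possible_sides("Cousin C", father={"Cousin D"}))

# Both parents are tested and neither has the match: likely a false segment.
print(possible_sides("Cousin C", father=set(), mother=set()))
```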
I am now working to incorporate this second type of inferred matching into DMT. We’ll soon see how well these two methods of eliminating possible lines work to help identify the ancestral path that the segments of your DNA came from.
Followup (3 hours following my post): Blaine Bettinger wrote on the Facebook Genetic Genealogy Tips & Techniques group that there are several names for this process. Blaine says he uses “Indirect Mapping”.
Revision: Mar 10: Nearly complete rewrite. On Facebook, Jonny Perl and Stevlana Hensman pointed out a major oversight I originally made in my article. I had thought that the Inferred match always resulted in knowing the ancestral line that your segment came from. That is only the case for parents and descendants of your parents (siblings, nephews/nieces, etc.). For anyone else, all it does is tell you the one ancestral line that your segment did not come from. That is still very useful information, however, and needs to be automated in DMT so that people can make use of it.
These concepts are brand new and are still being discovered by the genetic genealogy community. They are not simple. I am still learning myself and my head still spins every time I try to map how DNA is shared. I appreciate all feedback as peer review is the best way to confirm, correct and improve methodologies.
Followup: Mar 14: I’ve confirmed that you can infer the grandparent when the inferred match is made through your parent or a descendant of your parent (i.e. siblings, nieces/nephews, etc.). The reason is that your parent gets one chromosome of each pair from each grandparent that only comes from two of your great-grandparents on that parent’s side. If you do not match to one, you must match to the other.
This does not work for uncles/aunts, 1st cousins, or other relations, because they need not have got the same grandparent segment that your father did. So for them there are four possible great-grandparent segments to choose from. You can eliminate one, but without further eliminations, that still leaves two on one grandparent’s side and one on the other.
Followup: Mar 15: I added the section at the end: “A Second Type of Inferral”
Followup: Oct 6, 2020: Blaine Bettinger gave a webinar on FamilyTreeWebinars about this technique. He now prefers using the term: “Deductive Mapping”.