
Louis Kessler’s Behold Blog

Some DMT Dialog, with Questions and Answers - Sat, 12 Nov 2022

Michael Kaplan, a user of my Double Match Triangulator program, sent me information about how he was using DMT and also asked a number of questions. I thought this conversation would be very good for other DMT users to read. Michael gave me permission to post the relevant parts of his email to me and to give his name. Thank you, Michael.

Here is what Michael had to say and ask (shown in “quotes”), along with my responses in green.


The Dialog, Questions and Answers

“I have begun using DMT (on GEDmatch) in earnest.  I now have a total of 161 kits in my B folder, including your kit.  Note that I do have a number of unrelated people in the database, including my wife, our daughter from my wife’s first marriage, and a couple of adoptees that I have helped. I’ll also add unrelated matches of my own matches, in order to determine on which side of my match’s family our relationship might be.”

“I began the process by generating my own segment match file, then doing the same for each of my known cousins.  When doing the segment search, I left the default minimum segment size as 7 cM.  That’s recommended, isn’t it?  I don’t want to bother with false matches.  In addition, reducing that threshold would actually reduce the number of discrete matches captured, since the limit of 10,000 segments would be reached much sooner.” 

Yes. The default of 7 cM is recommended. Even triangulations can be false (matches by chance) if they are under 7 cM.


“I manage kits for quite a few of my cousins.  Others have done the upload to GEDmatch on their own.  Then assigned MRCAs, as appropriate.  I do manage a kit for one close match whose relationship is yet to be determined.  Renee Watson has ended up in the "FFF" cluster but I haven’t confirmed the exact connection.”

“Is there a recommended next step?  Unfortunately, I don’t think I documented the exact path that I have taken. I’ll try to reconstruct my thought process.

  1. Find the closest match (person B) which has not yet been added to the B folder.
  2. Generate segment match file.
  3. Make initial DMT run to generate People file for person B.
  4. Re-run DMT on my People file (person A), looking for triangulations with known cousins (and other matches already processed)
  5. Run "Match both kits or 1 of 2 kits" on A and B, with shared matches first ordered according to shared DNA with A, attempting to find additional triangulations. 
  6. I’m running DMT with recommended settings of 7 cM for Min Triang and 15 cM for Single Triang.  Default settings for "Match both kits or 1 of 2 kits" are 10 cM for threshold of largest segment to qualify as a match, and 10 cM for threshold of total matching segments to qualify as a match.  Should I reduce both of those settings to 7 cM to match DMT?
  7. Then select the closest shared matches (ideal number TBD) for "Multi-Kit Analysis", running "Triangulation" on that subset.  Default setting for triangulation is 7 cM.  Should I leave it at that? 
  8. For each shared match C which is reported as triangulating with A and B, generate a segment match file and add to the B folder, if one has not yet been created. 
  9. If the segment match file had already been created for C, confirm that B is shown as a match in C’s People file.  If not found, generate one-to-one files for match between B and C.
  10. Then re-run "Match both kits or 1 of 2 kits", this time ordering shared matches according to shared DNA with B.
  11. Repeat steps 7-9.
  12. Now, look at tab on "Match both kits or 1 of 2 kits" for kits matching only B.
  13. Generate segment match files for the closest matches.
  14. Re-run DMT for person A (me) and person B.
  15. If tree for person B is available on GEDmatch, try to assign MRCAs on B’s People file
  16. Otherwise, examine source for GEDmatch kit (e.g., Ancestry, MyHeritage, FTDNA) and check if tree available on that platform.
  17. Return to step 1 and iterate through remaining unprocessed matches, periodically re-running DMT, as appropriate.”

Step 6:  No need to change the default GEDmatch settings. The 7 cM and 15 cM settings in DMT are for individual segments. The 10 cM settings at GEDmatch are for the longest segment and the total of segments. They determine which people to include as matches and do not filter individual segment matches. With endogamy, you might if anything want to increase the longest segment threshold, rather than decrease it.
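To make the distinction in Step 6 concrete, here is a minimal Python sketch. It is not how GEDmatch or DMT is implemented. The match names and segment lengths are made up, and the function names `passes_person_filter` and `usable_segments` are hypothetical. It only shows that a person-level threshold keeps or drops a whole match, while a segment-level threshold is applied to each segment separately.

```python
# Hypothetical segment lengths in cM, keyed by match name (illustrative data only).
matches = {
    "Match 1": [22.4, 8.1, 7.3],
    "Match 2": [9.6, 7.2],
    "Match 3": [12.0, 5.5],
}

def passes_person_filter(segments, min_largest=10.0, min_total=10.0):
    """Person-level test (like the 10 cM GEDmatch defaults): keep or drop the whole match."""
    return max(segments) >= min_largest and sum(segments) >= min_total

def usable_segments(segments, min_segment=7.0):
    """Segment-level test (like a 7 cM minimum): keep or drop each segment on its own."""
    return [s for s in segments if s >= min_segment]

# Apply the person-level filter first, then the segment-level filter.
kept = {
    name: usable_segments(segs)
    for name, segs in matches.items()
    if passes_person_filter(segs)
}
print(kept)
```

In this made-up data, Match 2 is dropped entirely because its longest segment is under 10 cM, even though both of its segments are over 7 cM. Match 3 stays as a person but loses its 5.5 cM segment.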

Step 7:  Default setting for triangulation is fine at 7 cM.

Step 12:  I’m not sure why you want kits matching only B. The point is to find the connections with Person A, so how will anyone not related to Person A help?

Steps 13 to 16:  I don’t see how using DMT for the B people is useful unless you want to determine Person B’s connections. Looking up Person B’s tree is definitely useful to see if you can find a connection with Person A’s tree, but there really is no need to find person B’s MRCAs and run them as person A in DMT if they are not your subject of interest.


“What’s the optimal procedure to determine the next kit to add to the B folder, given only the available kits in GEDmatch?”

I don’t know if there’s an optimal way. Using the closest match as you are doing is okay.


“How do we deal with a match with status "In Common With B" in assigned cluster "M"?  Do I need to find shared matches which will triangulate?”

“In Common With B” indicates that A has segment matches with C and B has segment matches with C, but none of the matches triangulate for A, B and C together. To find triangulations, you’ll need segment match files of other B people who have matches with C on other segments.  Matches will triangulate for everyone who is a true match of A on that segment on one of the parent sides.


“For status "Only AC matches", I need to also find shared matches which will triangulate.  Is that true?”

In this case, person C does not match any of your B people. The segments Person B matches with you should not triangulate with anyone else on both your father’s and mother’s side. If they do, then either the segment is false or your father or mother’s triangulation group must be false. As you fill in more and more of your segments with triangulation groups, you should reduce the number of “Only AC matches”.
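The statuses discussed above can be sketched roughly in Python. This is only an illustration of the definitions as worded here, not DMT's actual algorithm. It assumes a single Person B, uses a hypothetical `Segment` type with base-pair positions, and ignores minimum overlap lengths in cM.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    chrom: str
    start: int  # base-pair start position
    end: int    # base-pair end position

def overlaps(x, y):
    """True if two segments share part of the same chromosome."""
    return x.chrom == y.chrom and x.start < y.end and y.start < x.end

def status_of_c(ac, bc, ab):
    """Classify Person C from the A-C, B-C and A-B segment match lists.

    Assumes ac is non-empty (C is already a match of A).
    """
    if not bc:
        # C matches A but does not match B at all.
        return "Only AC matches"
    for s_ac in ac:
        for s_bc in bc:
            for s_ab in ab:
                # All three pairs share a common stretch of DNA.
                if overlaps(s_ac, s_bc) and overlaps(s_ac, s_ab) and overlaps(s_bc, s_ab):
                    return "Has triangulation"
    # A matches C and B matches C, but never on a common stretch with A-B.
    return "In Common With B"

ac = [Segment("5", 10_000_000, 30_000_000)]
ab = [Segment("5", 12_000_000, 25_000_000)]
print(status_of_c(ac, [Segment("5", 15_000_000, 40_000_000)], ab))
print(status_of_c(ac, [Segment("9", 1_000_000, 20_000_000)], ab))
print(status_of_c(ac, [], ab))
```

With more than one Person B, the same test would be repeated for each B file, which is why adding B people who match C on other segments can turn an "In Common With B" into a triangulation.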


“Matches with status "Has triangulation" in cluster "U" appear to match multiple other kits at the same location.  However, those segments have not yet been associated with a common ancestor.  Is that correct?  I assume that process involves more in-depth examination of family trees.” 

Yes. If none of a person’s segment matches have been assigned an ancestral path, then they will be put in Cluster “U”.

DMT can only assign ancestral paths to segments where the MRCA of Person B or C is known. So the goal is to include as many MRCAs as possible in Person A’s file, whose relationship with Person A is known and is only a single relationship (not related 2 or more ways). Second cousins and further are best since they will give you at least the grandparent level FF, FM, MF and MM.  It might sound like you’d want to go as deep as possible, e.g. 6th cousins, but the worry the farther out you go is that they might be related a different way that you don’t realize.


“While I am aware that I have a large family and many known cousins have already tested (and uploaded their data to GEDmatch), those kits do not uniformly cover all of my ancestral lines.  There are still quite a few gaps in my chromosome map.  Importing the chromosome map data into DNApainter.com, on my paternal side, I’ve got 46% of 396 segments painted, including the following:

image

On my mother’s side, though, it’s only 13% of 131 segments painted:

image

Total is 29% of 527 segments painted.”

“Do you have any recommendations as to how I can increase that test coverage?  My father’s side is larger.  I’ve just sent my sister an Ancestry DNA kit.  Hope she might be a closer match to some of my more distant matches.  I’ll transfer her data to GEDmatch when it’s ready.  Another paternal 1st cousin has already tested on 23andMe and Ancestry, I’ve asked him to upload to GEDmatch.  One of my maternal 1st cousins is already on GEDmatch.  I’ve asked the rest of my 1st cousins who have not yet tested anywhere if they’d be willing to contribute to my ongoing research.  Unfortunately, one of those was adopted, so not a blood relative.  Another has had a bone marrow transplant and is not a good candidate for a DNA test.”

Once you’ve got all your relatives tested that you can get tested and you’re maxed out on the % you have painted, then it might be worthwhile to study your DMT Combined Run for Person A (yourself) to see what else it tells you. Start to identify triangulation groups the way Jim Bartlett does on his segmentology.org website.

Another thing you can try is to go into your People file and look for people who match on 4 or more segments and whose segments are all in the cluster they are put into or else are “U”. That may indicate (no guarantee, especially with endogamy) that they match Person A along that cluster’s ancestral line.  Then you can try adding MRCAs for them, but don’t include the “R” at the end, since you don’t know their exact MRCA with Person A. For example, if they have 4 segment matches on FF, FF, FFM and FF, then you try assigning them an MRCA of FFM (not FFMR). DMT will use that MRCA as part of its attempt to initially assign each segment, but it will be overruled by any triangulations with ancestral paths from MRCAs ending in “R”. This may help fill in some unknown areas where there are no known relatives with segments.
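One possible reading of that tip, sketched in Python: if a person has 4 or more segments and every assigned path lies on a single ancestral line (each path is a prefix of the deepest one), suggest the deepest path as a tentative MRCA without the trailing “R”. The function name `suggest_mrca` and the prefix rule are assumptions made for illustration. They are not part of DMT.

```python
def suggest_mrca(segment_paths, min_segments=4):
    """Suggest a tentative MRCA path (no trailing 'R') or return None.

    segment_paths: ancestral path per matching segment, e.g. "FF", "FFM", or "U".
    """
    if len(segment_paths) < min_segments:
        return None  # too few segments to say anything
    known = [p for p in segment_paths if p != "U"]
    if not known:
        return None  # nothing assigned yet
    deepest = max(known, key=len)
    # Every assigned path must sit on the same line as the deepest path.
    if all(deepest.startswith(p) for p in known):
        return deepest
    return None

print(suggest_mrca(["FF", "FF", "FFM", "FF"]))
print(suggest_mrca(["FF", "MF", "FF", "FF"]))
print(suggest_mrca(["FF", "FF", "U"]))
```

The first call gives FFM, matching the example in the text. The other two give no suggestion, because the segments point to two different lines or because there are fewer than 4 of them.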


“Is it feasible to work the other way around, say, using the DNApainter chromosome map to select an as yet unidentified segment on one side or the other?  Then enter that segment into the GEDmatch segment search tool, finding matches who have a shared match with me on that segment.  I could then examine trees for those matches, if any, to potentially find a common ancestor.  If no tree is available (or insufficient depth is provided), I could reach out to the match for additional detail.” 

You can do that to find people at GEDmatch who match you on a specific segment. The only problem is, you won’t know what ancestral path they match you through. Searching that person’s ancestral tree will only help you if they connect to Person A’s tree. Everything in DMT, including the MRCAs, is done from Person A’s point of view.


“I have begun building floating trees for many of my closer unconfirmed matches on Ancestry, where they have trees.  Also using public records searches to identify other matches who have relatively unique names.  I’m reaching out to my closest matches, introducing myself, listing my known immediate ancestors, describing our shared matches, offering to share trees, and asking if they’d be willing to upload to GEDmatch.  I’ll do the same on 23andMe, MyHeritage, FTDNA, and GEDmatch.  I’ve given up on LivingDNA.  Response is likely to be low, but I’m hoping to find connections to the poorly covered branches of my family.  I’ve already gotten a favorable response from one close match and have begun to analyze our connections.

As you are progressing through an analysis, at what point do you give up and determine that a match is likely due to endogamy alone and probably untraceable?” 

In my case, any match that goes further back than my furthest researched ancestors is probably untraceable, simply because there are no more records from the Russian Empire and Romania that will connect them.

With regard to endogamy, it is almost the same as pileups. Endogamy segments are from long ago and have been passed through many people on various lines. They do indeed triangulate because they are actual segments and each segment is passed down to you through one specific ancestral path. But that path is always too deep to trace and connect to.  They are no different than any match that goes further back than my furthest researched ancestors.


“What are your minimum criteria for considering a match in the first place?  I generally focus on matches with more than 90-100 cM shared DNA and largest segment of at least 15-20 cM.  However, I’ve had cases where ThruLines (on Ancestry) has found a legitimate match of as little as 8-9 cM, when there are well-documented trees on both sides.  So, I’ve tried to provide as much depth on my tree as possible.”

For the B people, I’ll start by taking the people I know the MRCA for. For the rest, I’ll then want to take the highest matching ones in each cluster, and try to get 5 or 6 in each cluster so that I have maybe 50 Person B files. I’m less concerned about the cM than about getting coverage of the closest matches in each cluster.


“Does DMT make any adjustment for known pile-up regions or any other corrections for endogamous populations?”

No. Pile-up and endogamous regions are true segments that are passed down an ancestral path. The only difference is they are too far back to find the connection. You can identify them easily in DMT when you see dozens or even hundreds of triangulations all lined up over a segment. These could overlap with perfectly valid segments of closer relatives who can be identified, so don’t ignore those regions. Just ignore the people you don’t recognize in those regions.


“Any further strategies/hints/suggestions for making optimal use of your wonderful tool, DMT?  I’m very happy with it, so far, and hope to learn quite a bit going forward.”

Check out the last page of the Help File that comes with DMT, titled:  “Ideas, Tips and Tricks”.

Your Family Statistics at MyHeritage - Fri, 28 Oct 2022

MyHeritage has just enhanced its Family Statistics feature. See their blog post from yesterday on this. Being a numbers guy, I really like this sort of information. It is interesting, provides great insights and can point to errors that need correction. I believe MyHeritage has one of the best sets of analytics of any program. In this post, I’ll show what they provide for my own family tree at MyHeritage.


My Family Tree at MyHeritage

Originally, I had just one big tree at MyHeritage. But as I approached 10,000 people in my tree, I decided to split it up. The reason is that MyHeritage’s relationship calculations only fully work for up to 10,000 people. The following paragraph is from a MyHeritage blog post from Sept 29, 2021:

image

So I looked at what parts of my tree would have very little overlap. I split out my nephew’s wife’s tree (534 people), my sister-in-law’s husband’s tree (158 people) and the tree I do for a friend of mine (213) into their own trees. The only duplication I have is just one person in my sister-in-law’s husband’s tree whose husband is in my tree.

That leaves me with 9,050 people in my own tree which includes both my and my wife’s families. It also includes my one place-to-place study of all the people who left Mezhirich, Russia in the early 1900’s to come to Winnipeg. My mother’s family and my sister’s mother-in-law’s family are part of this study.


Summary of My Tree

MyHeritage has a “Manage family trees” page that lists the 4 trees I manage. You get to this page by clicking on “Home” and then clicking on the “Family Trees 4” line.

image

The Manage family trees page shows this basic summary for my tree: (click on any image for a larger version)

image

This already gives interesting information. For the 9,050 people in my tree, there are 3,411 families (marriages/partners) and 2,328 unique surnames. I have 24,411 events (about 2.7 per person), 1,038 sources (each source might be cited multiple times) but only 190 notes (I’ve only recently started working on those).

Notice the words “Individuals:” and “Sources:” are in light blue. They are links. Clicking on them takes you to the MyHeritage Name Index and Source Index for your tree. You can also click on the name of the person in the earliest and most recent event to go to their profile page.


Family Statistics

Now click on the Home button and select “Family statistics UPDATED”:

image

If you have a big tree, you’re likely to see it tell you to wait a bit:

image

This does not seem to refresh on its own, but click one of the other menu items and you may find the report to be completed already.


The Overview Tab

image

The analysis of my tree is for 9,049 people but my tree has 9,050, so it must be leaving myself out of the analysis on this Tab.

My tree has 4,277 males, 3,960 females, and 812 of unknown gender. Jewish records tend to give a person’s father’s name, which likely explains why I have more males than females.

My tree has 5,648 living people and 3,401 deceased people. My tree is public, but MyHeritage privatizes the information for living people.

If you take the 9,050 people in my tree and subtract off the 2,968 single people and divide that by 2, you get 3,041 pairs of people that have relationships. However earlier, the statistics said there were 3,411 families. The difference is because of people who were married more than once. They’d count as just 1 married person, but as more than one family.
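As a quick check of that arithmetic, here is a small Python sketch using only the numbers quoted above. The variable names are just labels chosen for this example.

```python
people_in_tree = 9050
single_people = 2968
families_reported = 3411

# Everyone who is not single, paired off two at a time.
pairs = (people_in_tree - single_people) // 2

# Families beyond one per pair, which the text attributes to people married more than once.
extra_families = families_reported - pairs

print(pairs, extra_families)  # 3041 370
```

So about 370 of the 3,411 families would be second or later marriages, if that explanation accounts for the whole gap.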

The “Male”, “Female” and “Unknown” under Gender and the “Living” and “Deceased” under Living vs Deceased are links. Clicking on them will take you to the Name Index that includes just that category. The relationships status items unfortunately are not links.

The top 15 surnames, and male and female first names shown at the bottom as a wordle are a nice feature. Hovering over each name with your mouse tells you how many of them there are. It says I have 89 Bronshteins, 187 Davids, and 64 Sarahs. Each of the names is a link that will bring up the Name Index showing just the people with that name. The Name Index includes 1 more than what hovering says, so it appears you should add 1 to the value you see when hovering. The Name Index also shows that the “first names” count includes any uses of the name as a middle name.


The Relationships Tab **NEW**

This is the new section that MyHeritage just added. I love it.

image

Of the 9,050 people in my tree, 1,203 are blood relatives, including 50 who are my ancestors and 2 who are my descendants. 6,714 are related by marriage and 28 are related by adoption, which gives a total of 7,946 people who are related in some way to me. That leaves 1,104 people who are not related to me.

Each of those categories is clickable, again bringing up the Name Index with just those people.

The most useful is the “not related to you” category. Clicking on it and then selecting people at random from the 1,104 of them, I can see some of them are families that I have recorded in my tree because they were very possibly related (with the same surnames from the same town as my relatives) but I have not yet figured out how they were connected. Some others were families in my Mezhirich to Winnipeg study. But then some of the people I clicked on were islands of just one or a few people such as this island of 5:

image

They must have got disconnected from my tree during editing, and I will need to determine whether they need to be reconnected or deleted. I could not have easily found them without this category.

Next are some “step” statistics:

image

The first column shows the number of people 1 step away from me, 2 steps away, … all the way to 10+ steps away from me.

The second column shows the number of people at my generation, at each generation above me up to 7+, and at each generation below me.

The third column shows the number of marriages between me and the people in my tree, up to 6 marriages.

Each of these categories is clickable bringing up the Name Index for these categories.

Now the next item on this Tab is my favorite. It shows the counts of my 1,231 blood relatives by how they are related to me.

image

And again, each box is clickable, which will give you a Name Index for all the people with that blood relationship to you. This table is very good if you want to estimate how many people in your tree you should share DNA with. Anyone who is 2nd cousin or closer, you will definitely share DNA with. 2C1R you should share with, but there is a slight chance you won’t. Any further and the probability that you share DNA with them reduces. For more about this, see Amy Williams’ article: How often do two relatives share DNA?


The Places Tab

The remaining tabs were available prior to the update. I might as well include them here for completeness.

The Places Tab shows 3 maps with the number of people born, died, and with residence facts in each country. Hovering over the country gives the number of people.

image

Unfortunately, these maps don’t have clickable links to the People Index.


The Ages Tab

This page is both interesting and useful.

image

Included are the Top 2 oldest and youngest living people and the Top 2 of those who lived the longest and the shortest. You can click on the “Top 10” at the bottom of each box to get more.

The oldest living people are often people you simply neglected to mark as deceased. My two oldest living people shown here are not living. There is a “Mark as deceased” link below their name that gives a very convenient way of correcting their status. I’ll fix them and any others once I’ve finished this post.

Other Tabs

The Births tab shows the number of people born by birth month, by zodiac sign, and by decade. I don’t think many people will make use of the zodiac sign. It would be nice if they had a list of people born on my birthday or within a few days of it.

image

The Marriages tab shows marriages and who has been married the most. There is a Top 10 link you can click on for the most marriages.

The oldest and youngest when married, longest marriages, husband much older, and wife much older categories could indicate you’ve done something incorrectly. You can see I’ve obviously made a few mistakes. It’s worthwhile clicking on the Top 10 or Top 3 to find other problems. Then rerun the statistics and check again.

image

You can also find and fix lots of problems on the Children tab. Oldest and Youngest people having children, and largest age differences are very useful. The smallest age difference is great to find twins. It would be nice if they had more than just the Top 3 of the smallest age differences since differences of between 2 days and 7 months probably indicate an error of some sort.
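The sibling age-gap check described above could be run on your own data along these lines. This is a minimal Python sketch with made-up children and birth dates. The function name `suspicious_sibling_gaps` is hypothetical, and 7 months is approximated as 210 days.

```python
from datetime import date

def suspicious_sibling_gaps(births, min_days=2, max_days=210):
    """Return consecutive siblings born between min_days and max_days apart.

    births: dict of child name -> birth date for one family.
    A gap of 0 or 1 day is treated as plausible twins and is not flagged.
    """
    ordered = sorted(births.items(), key=lambda item: item[1])
    flagged = []
    for (name1, d1), (name2, d2) in zip(ordered, ordered[1:]):
        gap = (d2 - d1).days
        if min_days <= gap <= max_days:
            flagged.append((name1, name2, gap))
    return flagged

# Illustrative family, not real data.
children = {
    "Child A": date(1901, 3, 1),
    "Child B": date(1901, 3, 1),   # same day as Child A: plausible twins
    "Child C": date(1901, 8, 15),  # 167 days after Child B: likely an error
    "Child D": date(1903, 5, 2),
}
print(suspicious_sibling_gaps(children))
```

Only the Child B and Child C pair is flagged here, since a gap of about five and a half months between births is not biologically plausible.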

image

There is also a Divorces tab. I won’t show that one, but it has the number of divorces by times divorced, who divorced the most, the longest marriage ending in divorce, age when divorced, and the oldest and youngest divorcees.


Family Tree Builder

MyHeritage’s Family Tree Builder desktop software, which syncs smoothly with your online tree at MyHeritage, does not have its own analytics. It does have a “Family site” tab designed to open up a few different reports from the online site. However, the “Family Statistics” selection does not appear to be working as I write this. It likely should bring in exactly what I’ve shown above from the MyHeritage website.

image


Finding Errors in Your Tree

The Family Statistics isn’t the only way or even the best way to check your tree at MyHeritage. For a more thorough check, you can use MyHeritage’s Consistency Checker which is available on the website from the “Family Tree” menu item.

image

From within Family Tree Builder on your desktop, you can select from the Tools menu: “Tree Consistency Checker” as well as two additional useful checks: “Check for deceased people” and “Check for duplicates”.

image

The Consistency Checker online gives me 492 issues, and the CC in Family Tree Builder gives me 561 issues. That seems to indicate that they are not doing exactly the same checks so it is worthwhile to check with both of them.


Conclusion

MyHeritage’s Family Statistics has some very interesting information in it. Some of it will identify errors you have in your tree, so it is very worthwhile to go through each of the statistics pages and see if it all makes sense. Even if your tree is already perfect as I’m sure it is, you’ll likely find a few interesting tidbits of information to tell to your family.

Now I’d better run off and fix all the likely errors that the Family Statistics and Consistency Checkers found for me.

Catching Up - Sat, 8 Oct 2022

I was shocked to notice that I hadn’t blogged in over 3 months. I think it’s time to catch up with what’s been going on.


The Last 3 Months

To be honest, I’ve had another non-genealogy project distracting me and taking some of my genealogy/programming time away. Also, we had a beautiful summer here in Winnipeg, and I’ve been doing lots of bike riding and swimming. It’s much harder to be indoors on my computer for programming when it’s so nice outside.


Double Match Triangulator

I’ve actually released 6 minor versions (5.0.1 to 5.0.6) since June. Most have had to deal with changes at GEDmatch. They seem to have been making a lot of improvements there and their changes can break DMT. I always try to address anything no longer working as soon as I find out about it.

Meanwhile, I have been involved for over a year on a DNA project with my wife’s 3rd cousin. He had close to 60 descendants of my wife’s great-grandfather and five suspected siblings, and we are trying in spite of endogamy to determine whether they are indeed siblings. I have been trying to use DMT to help with the analysis and the data has been an excellent test bed for DMT. It has led to improvements included in Version 5.0 that was released in May.

I still have some work to do on that study, and that may test DMT some more.


Behold

During the Spring, I was able to spend a couple of months working on Behold, trying to finish up Version 1.3 and get the Everything Report to be how I wanted it. But then the DMT work and the summer got in my way.

For the longest time, I’ve always expected Behold would become the genealogy editor that I would use for recording my genealogy. But I never got past it being just a genealogy data viewer. A few years back, in 2018, I decided I couldn’t wait for myself and Behold any more and I needed an editor for my genealogy. I decided to use MyHeritage online and its Family Tree Builder software on my computer. I am very happy that I did. That’s because there’s nothing like the smart matches that do a lot of research for you, pointing at records that more often than not are correct, and connecting you with other researchers and their trees.

As a result, my ideas for Behold have changed. I now have all my family tree data up on MyHeritage, and some of my data up on Ancestry, FamilySearch, WikiTree, Geni, Geneanet and GenealogyOnline. Of those, FamilySearch, WikiTree and Geni are one-world trees, meaning other people are editing and updating my family there as well.

What I need now, rather than an editor, is a tool to help me evaluate and compare the information I have on MyHeritage to the information that’s on the other trees. I need to be able to find out what’s changed from my own information so that I can determine what needs to be updated. And I’d like some help to keep the thousands of profiles I have on the various systems in sync along with the photos and records attached to them.

There used to be a program called AncestorSync. It was designed to keep all your trees in sync on the various systems. It would have been perfect. But unfortunately, the team stopped developing the program.

I have no desire to try to get Behold to actually sync your data. That would be a lot of work as I’d have to make agreements with the various tree operators to be allowed to update their systems, and I’d have to be meticulous in learning to use their APIs (Application Programming Interfaces) to make the updates without mistakes. That’s a bit more than I want to take on.

Instead, I could do the next best thing: Set up Behold to help you manually update another tree with your data from a different tree. I know I myself want that function, and I would think that a lot of other people might find that very useful as well. We’ll see how it goes.


Conferencing and Social Media

After over 2 1/2 years of our Covid world, in-person genealogy conferences are finally starting up again. During the 2010’s, I attended and gave talks at 11 International Genealogical Conferences, 6 in the United States, 3 in Canada, 2 on the high seas around Australia and New Zealand, and 1 in The Netherlands.

2020 changed things and the world started going virtual. Now we’re inundated with a cornucopia of genealogical webinars available to us while sitting at home in our pajamas on our computer. I know I have watched hundreds of genealogy presentations in the past 30 months through providers such as Legacy Family Tree Webinars, the Virtual Genealogical Association, RootsTech, Dear Myrtle, Family History Fanatics, WikiTree, GeneaBlogger, Ed Thompson, the Association of Professional Genealogists, and presentations by the FGS and local genealogy societies – not just where I live but from all over the world. I have also been a speaker at several of these online conferences.

And last year I took an excellent online SLIG (Salt Lake Institute of Genealogy) course to help me with my Russian research.

I have got to the point where my head is 98% full and I’ll be starting to be more selective of what webinars I watch and conferences I go to. There will have to be something quite new and appealing to attract me now.

Social media is sort of the same thing. Genealogy groups on Facebook exploded in popularity in the past 5 years. I participated quite a bit at the beginning, but they have become less useful to me since then.


DNA

I’ve done just about everything I can with my own DNA. I’ve tested at all the sites including Big-Y and mtDNA at Family Tree DNA. I’ve used all the tools at all the sites as well as many 3rd party DNA tools. Endogamy and ancestry that goes back only to the early 1800’s (Romania and Russian Empire) have limited my ability to connect to many of my DNA relatives. I have even taken a few WGS (Whole Genome Sequencing) tests to see what they could do for me.


Looking to the Future

My genealogy has taken great steps in the past couple of years, due to the help of MyHeritage hints and also hundreds of Russian and Romanian records of my family found for me by some excellent researchers. I’ve got to get this all organized and fully documented.

I’ll be continuing to work on Behold to help me with this, and will update DMT as needed. I’ve got my GenSoftReviews site that I’ll keep maintaining as well as my personal lkessler.com website. Maintaining websites, by the way, does take a fair bit of time. Especially when an issue like updating a PHP version takes hold.

I’m still very excited about the future, because we never know what’s in store.