Louis Kessler’s Behold Blog

Double Match Triangulator 2.9.4 - Fri, 21 Dec 2018

I’ve been working on DMT version 3.0 for the past few months and was hoping not to have to release anything before it’s ready.

Unfortunately, a few file format changes at some of the companies have caused DMT 2.1.1 to no longer recognize their files.

In October, I noted that Family Tree DNA made significant changes to the format of their Chromosome Browser Results file. But DMT was still able to handle it … until a month later when they made an additional change and added a space after the comma before the chromosome number. That broke DMT:

image

Also it was reported to me that MyHeritage Shared Segment files weren’t being recognized by DMT when including them as Folder B files. As it turns out, that was such a dumb error of mine. When detecting the files, DMT was looking for “segements” in the file name instead of “segments”. If you changed the file names to include the extra “e”, then they’d be recognized, but DMT wouldn’t read them because at reading time DMT was looking for “segments”. Catch-22.

The coup de grâce was just this week, when DMT’s GEDmatch button was no longer able to handle the Segment Search report in GEDmatch Genesis. It turns out they added a checkbox to that report:

image

So at this point, DMT was having trouble with FTDNA, MyHeritage and GEDmatch files. I had created private versions (2.9.1 to 2.9.3) and sent them to people who reported these problems to me. But there was now enough needing fixing that an intermediate version for everyone was necessary. Thus, 2.9.4 is now available from the DMT homepage.

2.9.4 includes a few other fixes as well. But if you have not encountered problems with DMT, have not downloaded new FTDNA files, don’t use MyHeritage files or Tier 1 of GEDmatch Genesis, then there’s no need to upgrade and you can wait until 3.0 is released.

My Living DNA Matches - Thu, 22 Nov 2018

During the summer, I took a DNA test with @Living_DNA. In addition to testing, I also took advantage of their free uploads, and uploaded both my 23andMe raw data, and my Family Tree DNA raw data.

I already posted about my results from my test, but at that time, their “Family Networks”, which is what they call their list of people that you match to, was not available. Living DNA has become the 7th resource to offer genealogists access to the list of people they match to.

About a month ago, on October 26, I received an email from Living DNA that my beta family matching results were available. I took a look. I believe I had 10 matches at the time, none of whom were very close, said to be 4th cousins or further, and none of whom I recognized.

Today, I received another email stating that I was in the “next group of users to gain access to our Family Networks beta.” I took a look and now, not only does my test result have matches, but one of my uploads, my 23andMe upload, also has matches.

I thought I’d take a more detailed look and record my observations. This is what my matches now look like:

image

I now have 30 matches listed. None of them are known relatives. I believe I have seen a few of them in my match lists from other DNA testing companies, so that is good verification.

The first match is with myself, because it is picking up my 23andMe uploaded kit. It says that I share 3572.89 cM or 98.54% with myself, which tells me they are using 3626 cM as their 100% basis. They say my predicted relationship with myself is “Identical twin” and if I click the “i” info button next to that, I get a cute description that says maybe it is another test, or else I found my clone.
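The arithmetic behind that 3626 cM figure can be checked with a small sketch. The function names are my own, made up for illustration, and this assumes the percentage shown is simply shared cM divided by a fixed total:

```python
def implied_total_cm(shared_cm: float, shared_percent: float) -> float:
    """Total cM that would make shared_cm equal shared_percent percent of it."""
    return shared_cm / (shared_percent / 100.0)


def percent_of_total(shared_cm: float, total_cm: float) -> float:
    """Shared cM expressed as a percentage of the total, to 2 decimals."""
    return round(100.0 * shared_cm / total_cm, 2)


# Values taken from the match list described in this post.
total = implied_total_cm(3572.89, 98.54)
print(round(total))                    # 3626
print(percent_of_total(97.43, total))  # 2.69
print(percent_of_total(63.4, total))   # 1.75
```

The same basis reproduces the percentages shown for my closest match and my 30th match, which suggests one fixed total is being used for every match.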

image

The other matches are all either predicted 4th cousins or predicted 5th cousins. The closest I match is 97.43 cM (2.69%).  The info box next to the cM value describes what a centimorgan is, but does not give any additional information such as number of matching segments or largest segment.

My 30th match is 63.4 cM (1.75%). All 30 matches are on the one page. There is no pagination. The “Export Results to CSV” button currently pops up a message that says: “Coming soon.” 

The “Message” button next to each match also tells you that it is coming soon.

Clicking on the “View Profile” button of my closest match (other than myself), gives mock ups of what will be a Map of my matches, a Chromosome Viewer and the Messages area, all of which are marked as coming soon. Currently there is no additional information about the match. Only the match’s name is given and nothing more.

But the profile does give something very useful now, and that is the shared matches. For my first match, it lists 23 shared matches. And each can be clicked on to bring up their profile. Unfortunately, they list your shared matches with the furthest one first to the closest one last. I’m sure Living DNA will see this and correct the order before too long.

But what is interesting is that the lowest 12 shared matches were between 39.19 cM and 60.69 cM. The other 11 shared matches were higher than 63.4 cM and were on my own match list, but these 12 were not. 

So what happens when I take a look at my shared matches with my upload from 23andMe? It gives me a list of 58 shared matches, all on one page, that start at 38.43 cM and go up to 97.43 cM (since it is lowest to highest). This gives me 29 additional matches that were not among the 29 on my own match list.

For fun, I also compared my matches from my DNA test versus my matches from my 23andMe upload. Of the 30 matches each showed, 27 were in common. 6 had the same cM. 14 of my test matches had larger cM and 7 had lower cM than my upload. The average difference was 3.3 cM and the largest difference was 11.68 cM (72.71 cM in my test and 84.39 cM in my 23andMe upload).
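A comparison like this is easy to sketch in code. The following is only an illustration, not how I actually did it: the match names and most of the cM values are made up, and it assumes the “average difference” is the mean of the absolute differences. Only the 72.71 cM versus 84.39 cM pair comes from my real lists.

```python
def compare_match_lists(test: dict, upload: dict) -> dict:
    """Summarize how two {match name: cM} lists differ for the shared names."""
    common = sorted(set(test) & set(upload))
    diffs = [test[name] - upload[name] for name in common]
    abs_diffs = [abs(d) for d in diffs]
    tolerance = 0.005  # treat values equal to 2 decimals as the same cM
    return {
        "in_common": len(common),
        "same": sum(1 for d in abs_diffs if d < tolerance),
        "test_larger": sum(1 for d in diffs if d >= tolerance),
        "test_smaller": sum(1 for d in diffs if d <= -tolerance),
        "mean_abs_diff": sum(abs_diffs) / len(abs_diffs) if abs_diffs else 0.0,
        "max_abs_diff": max(abs_diffs) if abs_diffs else 0.0,
    }


# Hypothetical sample data (only the "Match B" pair is from the post).
test_matches = {"Match A": 97.43, "Match B": 72.71, "Match C": 65.10, "Match D": 64.00}
upload_matches = {"Match A": 97.43, "Match B": 84.39, "Match C": 63.90, "Match E": 70.00}

summary = compare_match_lists(test_matches, upload_matches)
print(summary)
```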

Living DNA has a few details to fix, pagination to add, and features to finish. But the framework looks very good and it’s nice to see their matches working.

Now it’s just a matter of waiting until Living DNA loads all their matches for everyone so that they give thousands of matches, like the other companies do.

For comparison, here are my current match statistics at the various companies:

image

The Database sizes are from Leah Larkin. It is interesting that AncestryDNA has so few people matching me with at least 50 cM, likely due to their Timber algorithm that filters out many matching segments. At the opposite end of the scale is Family Tree DNA that includes segments down to 1 cM in their total count.

I must say that I have found more traceable cousins among the 49 matches above 50 cM at Ancestry DNA than I have at any of the other companies. The next best company for determinable matches for me was 23andMe.

The above table gives you the 7 DNA pools where you can now get matches. Remember, you can only catch your DNA relative if they’re swimming in one of your pools.




Update: Dec 11, 2018:  Leah Larkin posted a quick poll on Facebook asking people how many matches they have on Living DNA. I checked at Living DNA and entered my current number of matches, which is 65, and that excludes my match to myself.

Over 70% of the people had zero matches. About 20% only had 1 match. The other 10% all had up to 10 matches. I am the extreme outlier with 65 matches.

I do notice that Living DNA has added pagination with 10 matches shown per page. They also seem to have fixed the problem of additional matches that I earlier was able to find through shared matches. Now all the matches I have seem to appear among the 65.

The New Chromosome Browser Results File at FTDNA - Sun, 21 Oct 2018

A few days ago, Family Tree DNA released a new version of its chromosome browser. There are other people that have already described the improvements, including Kitty Cooper and Roberta Estes, but I’d like to focus on one item that’s changed that affects Double Match Triangulator users.


New Link to Download Segment Matches

The segment matches that you download for use in Double Match Triangulator are downloaded into a file that I’ll call the Chromosome Browser Results (CBR) file, since it is named: 

nnnnnn_Chromosome_Browser_Results_yyyymmdd.csv

where:

  • nnnnnn is your Family Tree DNA kit number,
  • yyyymmdd is the date of your download, and
  • .csv indicates this is a comma delimited file which can be read by Excel and other programs.

It contains a header line and one line for each segment match that you have with every person who you match to.
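As a rough sketch, a program could pick the kit number and download date out of a file name of that form like this. The function name is hypothetical, and I’m assuming the kit number is made up of only letters and digits:

```python
import re
from datetime import date, datetime
from typing import Optional, Tuple

# Pattern for nnnnnn_Chromosome_Browser_Results_yyyymmdd.csv
# (assumption: the kit number contains only letters and digits).
CBR_NAME = re.compile(
    r"^(?P<kit>[A-Za-z0-9]+)_Chromosome_Browser_Results_(?P<ymd>\d{8})\.csv$"
)


def parse_cbr_filename(filename: str) -> Optional[Tuple[str, date]]:
    """Return (kit number, download date), or None if the name doesn't fit."""
    m = CBR_NAME.match(filename)
    if m is None:
        return None
    try:
        downloaded = datetime.strptime(m.group("ymd"), "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that are not a real calendar date
    return m.group("kit"), downloaded


print(parse_cbr_filename("123456_Chromosome_Browser_Results_20181021.csv"))
```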

The way you download the file has changed. Previously, you used to go to the Chromosome Browser page and click on the “Download All Matches to Excel (CSV Format)” link that I’ve shown below highlighted in orange:

Now, you get it a different way. You must first go to your home screen and click on the Chromosome Browser box:

image

That will take you to the new Chromosome Browser tool page. Before you select anybody, scroll down to the bottom of your list of DNA matches, and you’ll see a “DOWNLOAD ALL MATCHES” link.

image

Click on that download link, and it will start to download your segment matches.

As before, you still have to be patient after you click the link. There is no immediate indication that anything is happening. I have a lot of matches (17,462 people with whom I have 347,193 segment matches) and I find it takes about 40 seconds for anything at all to happen. Then a window pops up to ask what I want to do with the file, which it says is 19.9 MB in size.


File Format Changes

There are a number of changes to the file itself as well.

1. The file now has a Byte Order Mark (BOM).

The BOM is a few bytes at the beginning of a file that tell programs reading the file what character encoding it uses. The BOM that FTDNA added says that this is a UTF-8 file, meaning it can contain any Unicode character. The BOM is somewhat of a technical detail you don’t need to worry about, but what this BOM indicates is that the file may contain names and/or words written in almost any language. They sort alphabetically after English letters, so you’ll find names starting with non-English letters at the end of the CBR file.

image

If you have downloads of the CBR file prior to this, they did include Unicode text, but without a BOM or a program knowing this, the foreign letters would only appear to be gobbledygook:

  image

Now that the BOM has been added, text programs, Excel, and DMT all read and display the names correctly. I’m currently working to release version 3.0 of DMT, and I’ll make sure it will also read the names correctly as Unicode from older files you may have downloaded before FTDNA included the BOM.
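For anyone reading these files with their own scripts, here is a minimal sketch of one way to handle both cases. It is not DMT’s code. It relies on Python’s “utf-8-sig” codec, which skips a UTF-8 BOM if one is present and otherwise decodes as plain UTF-8. The function names and the sample name are made up:

```python
import codecs


def has_utf8_bom(raw: bytes) -> bool:
    """True if the raw file bytes start with the 3-byte UTF-8 BOM."""
    return raw.startswith(codecs.BOM_UTF8)


def decode_cbr_bytes(raw: bytes) -> str:
    """Decode file bytes as UTF-8, dropping a leading BOM if there is one."""
    return raw.decode("utf-8-sig")


# A made-up name with a non-English letter, encoded with and without a BOM.
without_bom = "Åsa Berg".encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
print(decode_cbr_bytes(with_bom) == decode_cbr_bytes(without_bom))
```

Either way the name comes out the same, so older downloads without the BOM and newer ones with it can be treated alike.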

2. They changed the file format.

The lines used to end with a Carriage Return and Line Feed (CRLF) which is the Windows file format standard. Now they end with just a Line Feed (LF) which is the Unix file format standard. I can’t imagine why they might have wanted to change this.
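A program that wants to accept both the old and new files can simply convert CRLF to LF before splitting the text into lines. This is just a sketch with a hypothetical function name and made-up data:

```python
def normalize_newlines(text: str) -> str:
    """Convert Windows CRLF line endings to Unix LF line endings."""
    return text.replace("\r\n", "\n")


old_file_text = "a,b\r\nc,d\r\n"  # CRLF, as in the older downloads
new_file_text = "a,b\nc,d\n"      # LF, as in the newer downloads

print(normalize_newlines(old_file_text) == new_file_text)
```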

3. They removed the double quotes from the text fields.

They used to have the name of the person and the name of the match always in double quotes:

image

They’ve removed them and it now looks like this:

image

Either way is fine for csv (comma delimited files), but a program will need to be able to handle it both ways and give the same results. 
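As an illustration of handling it both ways, Python’s standard csv module reads a quoted line and an unquoted line into the same fields. The two sample lines below are made up and are not copied from a real CBR file:

```python
import csv
import io


def read_rows(csv_text: str) -> list:
    """Parse CSV text into a list of rows, each row a list of field strings."""
    return list(csv.reader(io.StringIO(csv_text, newline="")))


old_style = '"John Smith","Jane Doe",1,100,200\n'  # names in double quotes
new_style = 'John Smith,Jane Doe,1,100,200\n'      # quotes removed

print(read_rows(old_style) == read_rows(new_style))
```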

One case that causes problems is a name with double quotes in it, e.g. “Buddy” John Williams.  They had previously included this as:  “”Buddy” John Williams”.

If you load the former into Excel, it will display as Buddy John Williams without the quotes around the Buddy and it will no longer match the “Buddy” John Williams that the previous form gave you.

This change caused a very strange bug in DMT that took me two mornings of debugging and over 100 compiles before I solved it. I had to trace the problem step by step to discover that it was caused by the quoting.

4. They changed the header line.

It used to be:  

image

Now it’s:

image

Well, all they really changed was from using upper case to using mixed case for the field names. That might not seem like much, but DMT used the first line to check whether you have an FTDNA segment file. It would be easy enough to simply uppercase all the letters and compare those … but then, they did add a space in “Match Name” in the new file as well.
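One way a program could tolerate both kinds of change is to compare the header only after uppercasing it and removing all spaces. This is a sketch and not what DMT actually does. The two header strings are my assumption of what the old and new header lines look like, and the function names are hypothetical:

```python
# Assumed header lines, for illustration only.
ASSUMED_OLD_HEADER = "NAME,MATCHNAME,CHROMOSOME,START LOCATION,END LOCATION,CENTIMORGANS,MATCHING SNPS"
ASSUMED_NEW_HEADER = "Name,Match Name,Chromosome,Start Location,End Location,Centimorgans,Matching SNPs"


def normalize_header(line: str) -> str:
    """Drop a leading BOM, uppercase everything, and remove all whitespace."""
    return "".join(line.lstrip("\ufeff").upper().split())


EXPECTED = normalize_header(ASSUMED_OLD_HEADER)


def looks_like_ftdna_cbr(first_line: str) -> bool:
    """True if the first line matches the expected header after normalizing."""
    return normalize_header(first_line) == EXPECTED


print(looks_like_ftdna_cbr(ASSUMED_OLD_HEADER))
print(looks_like_ftdna_cbr("\ufeff" + ASSUMED_NEW_HEADER + "\n"))
```

Removing every space is a blunt approach, but it means a future change in spacing alone would not stop the file from being recognized.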


The Effect of These Changes

The difficulty with writing programs that read in files produced by anyone else is that the file format can be changed at any time and break a developer’s utility program. For a desktop program like DMT, you then have to wait for its next release and hope that the programmer noticed the changes and the program has been updated to handle them. (p.s. Thank you to all of you who report problems to me. If I don’t know about them, I can’t fix them.)

With regards to handling file format changes, I envy those who write online programs, because they can squeeze in a fix online at any time. Changing a packaged program like DMT is a bit more involved.

All utility programs are subject to the whims of the programmers and web developers at FTDNA, GEDmatch, 23andMe, MyHeritage DNA and AncestryDNA. Any time they make a change, they affect the utility programs that use their data.

I’m still working on version 3.0 of Double Match Triangulator. Corrections for these FTDNA file format changes will be included, which should allow files in the new format to be compatible and work with files of the old format.




Update: Nov 5, 2018:  Since I wrote this article 3 weeks ago, Family Tree DNA made one other tiny change to its Chromosome Browser Results file. They added a space after the comma but before the chromosome number on each line. This breaks Double Match Triangulator.


Update: Nov 28, 2018:  Here is a temporary fix to get the new format CBR files to work with DMT version 2.1.1. There are two possible methods:

1. If you have a text processor, open the CBR file with the text processor and change all comma spaces to just a comma, i.e. change all “, “ to “,”.  Then save the file.

2. If you don’t have a text processor, open your file with Excel. Select Column C (Chromosome), open the “Find and Replace” box and change space-X to X, i.e. change “ X” to “X”. Then save the file as CSV UTF-8 (Comma delimited) (*.csv). This will remove the extra spaces that FTDNA has added and will convert their file back to standard .csv format.

image   
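If you are comfortable running a script, the first method can also be sketched in a few lines of Python. The function names and the sample line are made up for illustration. Like the manual replace, it changes every comma-space in the file, including any that happen to be inside a name:

```python
def remove_space_after_commas(text: str) -> str:
    """Change every ", " to "," (the same edit as method 1 above)."""
    return text.replace(", ", ",")


def fix_cbr_file(source_path: str, fixed_path: str) -> None:
    """Read a CBR file, remove the spaces after commas, and write a new file."""
    # "utf-8-sig" skips a BOM when reading and writes one when saving.
    # newline="" leaves the file's own line endings untouched.
    with open(source_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    with open(fixed_path, "w", encoding="utf-8-sig", newline="") as dst:
        dst.write(remove_space_after_commas(text))


# Made-up sample line in the new format, with a space before the chromosome.
sample = "John Smith,Jane Doe, 1,100,200\n"
print(remove_space_after_commas(sample))
```

Writing to a new file rather than over the original keeps your download intact in case anything goes wrong.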

Fixes to handle FTDNA’s new format have been developed and will be in version 3.0 of DMT when it is released.


Update: Dec 10, 2018:  Less than 2 months after Family Tree DNA changed their Chromosome Browser Results file and the method to download it, they’ve managed to change the method to download it again.

Now, on their Chromosome Browser page, you click on the “Download All Segments” link that’s at the top right of the DNA Matches listing. You still have to do this before you select matches to compare.

Then you wait a while. It took about 30 seconds for me before anything happened and my browser said the file to download was available.

image

This is actually a good change. Where they first had it at the bottom of the page was not easy to find. Now it is readily accessible.

Since it takes a while to respond, they really should pop up a box right away saying: “Please wait while the download is assembled”. But unfortunately they don’t, so most people will press it 100 times to get it to work, when really only one time will do.

The instructions shown in the Update of Nov 28, 2018 must still be followed to put the new file into a format that DMT can read, until Version 3.0 is released.




Update: Jan 8, 2019:  Version 2.9.5 of Double Match Triangulator has been released to handle the new Family Tree DNA format.