
Louis Kessler’s Behold Blog

Behold, My Genealogy, and Syncing - Sun, 25 Feb 2024

Over the past several months, I’ve been back to work on the next version of Behold. I’m hoping to release the next major version in the next …  - okay, a programmer knows better than to promise a release date, but let’s say as soon as it’s ready. Keep an eye on Behold’s Future page to follow my progress.

The last major release of Behold was Version 1.2.1 which I released in March 2016. Since then, I’ve released 6 additional point versions made up mostly of fixes and small improvements, with the last point release being Version 1.2.7 in September 2021.

So it’s been almost 8 years since the last major release of Behold. What have I been up to?


What Have I Been Up To?

Two things really.

The first thing that caught me was DNA. It was just after my 2016 Unlock The Past Genealogy cruise that I submitted my uncle’s and then my DNA to Family Tree DNA for testing. At that point it was DNA or Bust and I submitted my DNA everywhere, learned everything I could about genetic genealogy, and in 2017 created my Double Match Triangulator (DMT) program which placed 3rd in the RootsTech 2017 Innovator Showdown. I spent a lot of time over the past 8 years developing DMT and getting every last drop of genealogical worth out of my tests. I’ve written a lot of technical blog posts about my DNA analysis over this time. And that journey has now run its course.

The other thing that slowed me down is my own genealogy. That’s a very good thing! It was 2016 when I headed into my retirement from my 40 year career at Manitoba Hydro. Up to that point, my genealogy effectively lay dormant in dozens of binders, files and boxes. This was material I collected over the years with the intention of going through and putting together once I retired. And I’ve been doing that.

As far as my actual family tree itself, I hadn’t updated it since 1994 when I was using Reunion for Windows. Leister sold their Windows program to Sierra Online who were redeveloping it as Generations, but it was then purchased by Genealogy.com and dropped to eliminate the competition for their program Family Tree Maker. I still have the last GEDCOM file I exported from Reunion called KESS9407.GED which had the 1,361 known relatives from my and my wife’s families.

I didn’t purchase another genealogy program after that. Instead, I started developing Behold in my spare time on evenings and weekends since I was then working full time. The intent was that it would be the genealogy editor I wanted for myself to replace Generations. I purchased a Rich Text editing package called TRichView to handle the display and editing. It works just like Word as a WYSIWYG (What You See Is What You Get) editor, and the goal was to turn Behold into what would still be the only genealogy WYSIWYG editor.

I released the first alpha version of Behold in 2005, and Version 1.0 was out in 2011. At that point it was still only a GEDCOM reader, but I still wanted it to be an editor.

Then a big change for genealogists happened. Companies like Ancestry and MyHeritage were offering online family tree programs with a bonus: billions of records with automated searches that provide you with relevant hints. That changed everything! In 2017, I attended the 13th International Genealogy Conference in Houston sponsored by Family Tree DNA. MyHeritage was there and offered a great lifetime half-price offer to all attendees on their complete package, and I bit. All of a sudden, MyHeritage’s Record Matches with their billions of records and Smart Matches with their millions of family trees were what was important. And their online editor was convenient and good enough, along with their free downloadable Family Tree Builder software that could sync with your online tree.

About the same time, I got lucky. Starting in 2017, records from my ancestors’ towns in what is now Ukraine and Romania started to become available. I acquired over 400 birth, marriage, death and census records from 4 different researchers and added 3 generations back to the early 1800s for most of my lines. MyHeritage and its record collections and family trees revolutionized my task of finding descendants of my newly discovered European family, sending me the likely matches to review. From the 1,361 known family members I had in 1994, my family tree has grown to be 10,800 today, which does include several thousand people in an important place to place study I have been working on.

image

I’m now sitting in a really good position. I’m well into digitizing my binders, files and boxes. Every day I check for new Record Matches and Smart Matches at MyHeritage and process them and their implications to my tree right away and research any additional hints they provide. The majority of my family tree information is now sourced. In 2016, I never thought I would get to this point.


So What’s Important Now?

Over the past 8 years, the information has just poured in. The tap is starting to run dry. I’m no longer expecting a lot of new information. Records only started in Eastern Europe in the early 1800s, so I won’t be able to go any further back. My family tree has matured and it’s now a matter of ensuring quality and keeping up with any new records that come along.

What’s missing from this equation is to ensure the preservation of the data I have collected and to make it widely available so that others who connect with me won’t have to work to put it together like I did. That would mean sharing it on other family tree sites such as Ancestry, FamilySearch, WikiTree, Geni, Geneanet and Genealogyonline.

I have an account on Ancestry, but I only have a small tree there. I have not used Ancestry’s hint system yet, since I’ve been concentrating on MyHeritage, but it would be valuable to do so. The key would be to set up a full tree there by downloading from MyHeritage and then uploading to Ancestry. Then Ancestry’s hints and family trees can work and do their magic and maybe fill in a few more boxes.

But once that initial tree is up, I can’t do it again. After I process the hints, a new upload will likely recreate all the old hints. So I’ll need to keep them synced somehow. There are two programs that claim to sync with Ancestry. One is Family Tree Maker and the other is RootsMagic.  I’ll have to experiment with both and see if a reinfusion of a new GEDCOM into either FTM or RM will continue to sync with Ancestry, or if it will break the linkage. If the linkage can be maintained, maybe I can then follow this procedure:

MyHeritage –> GEDCOM –> FTM or RM

FTM or RM –> Ancestry

Ancestry Hints –> MyHeritage
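
Before re-importing a fresh GEDCOM into a desktop program, it may help to see what actually changed since the last export. The following is only a minimal sketch in Python, not a feature of any of the programs named above. The function names `read_individuals` and `compare_exports` are made up for illustration, and it assumes that record IDs such as `@I12@` stay the same from one export to the next, which is not guaranteed.

```python
import re

# A level-0 individual record header looks like: 0 @I12@ INDI
INDI_HEADER = re.compile(r"^0\s+@([^@]+)@\s+INDI\s*$")
NAME_LINE = re.compile(r"^1\s+NAME\s+(.*)$")


def read_individuals(gedcom_text):
    """Return {record_id: name} for every INDI record in a GEDCOM string."""
    people = {}
    current = None  # ID of the INDI record being read, if any
    for raw in gedcom_text.splitlines():
        line = raw.strip()
        header = INDI_HEADER.match(line)
        if header:
            current = header.group(1)
            people[current] = ""
        elif line.startswith("0 "):
            current = None  # some other top-level record (HEAD, FAM, SOUR, ...)
        elif current is not None and not people[current]:
            name = NAME_LINE.match(line)
            if name:
                # GEDCOM marks the surname with slashes: Moshko /Furman/
                people[current] = name.group(1).replace("/", "").strip()
    return people


def compare_exports(old_text, new_text):
    """Return (added, removed, renamed) lists of record IDs.

    Assumption: the exporting program keeps record IDs stable between exports.
    """
    old, new = read_individuals(old_text), read_individuals(new_text)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    renamed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, removed, renamed
```

Only the first NAME line of each person is compared here. A fuller check would also look at events, sources and family links.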

FamilySearch is also an important tree to have information at. I uploaded about 1,500 deceased family members via a GEDCOM a few years ago. RootsMagic and Ancestral Quest both sync with FamilySearch. I also understand that MyHeritage provides syncing with FamilySearch as well, but currently only for members of The Church of Jesus Christ of Latter-day Saints. Hopefully they eliminate that restriction in the future. Even so, you have to be careful because other people edit FamilySearch. You wouldn’t want to copy any unverified information from FamilySearch back into your own tree.

WikiTree is a One World tree. It is different because it stresses biographies with human input. This is valuable but requires a lot of manual labour to maintain. I was fortunate to have been a WikiTree Challenge guest and had my tree worked on, and I added to that later by being a participant in two of their Connect-a-Thon events. They have some great tools including a WikiTree Sourcer Browser extension which could pull a person’s facts from a FamilySearch or Ancestry page and then enter it for you on a new WikiTree person page, saving you a whole lot of typing and manual effort. Great tool! There is no program which will automatically sync with WikiTree for you. However, Behold does a pretty good job of displaying your WikiTree data that you’ve downloaded to a GEDCOM file.

Geni is another One World tree now owned by MyHeritage. It would be nice if MyHeritage could figure out some way of syncing data between Geni and MyHeritage. Geni does get hints from MyHeritage, and MyHeritage does give Record Matches with Geni profiles.

I’ve uploaded an extract of my family tree via GEDCOM to Geneanet and genealogyonline a few years ago, but I really haven’t worked enough with either of them to figure out how to best make use of their sites.


And What’s Needed Now?

We each have our one primary place where we maintain our family tree. It may be a desktop program, or an online tree, or a desktop program synced with an online tree.

What we need are programs to sync and/or make it easier to propagate our information everywhere else. Nobody wants to have to retype everything a dozen times.

I no longer need to convert Behold into a genealogy editor. MyHeritage for me is good enough for that.

But I do like the assistance Behold already provides to be able to easily see what data I’ve got at MyHeritage and at all the other sites.

I want Behold to do a bit more. I have some ideas and I’m working on it.

Stay tuned.

Can Artificial Intelligence Read Russian Handwriting? - Wed, 7 Feb 2024

There’s been a lot of talk the last year or so about the use of Artificial Intelligence for Genealogy. I’ve basically taken a laissez faire wait-and-see attitude towards it. Most of the applications of AI for genealogy are designed to save you time, maybe by drafting out a biography for you or doing image creation, repair or animation.

But I’m looking for something that can help me, and help me specifically with regards to one particular task. The task I’m interested in is reading handwriting – not just any handwriting, but the handwriting in Birth, Marriage, Death and Census records from the Russian Empire.


Transkribus - AI to Read Handwriting

I was made aware by Jarrett Ross’s post on Twitter a week ago of an online program called Transkribus.

Transkribus describes itself as:

“an AI-powered platform for text recognition, transcription and searching of historical documents – from any place, any time, and in any language.”

You upload your handwritten document. You select one of their public models for different languages and time periods. If the public models don’t serve your needs, you can train a custom model. They supply an introductory video on Getting Started with Transkribus.


Artificial Intelligence Model Types

Reading handwriting is a difficult problem, but is something that Artificial Intelligence should one day be able to handle.

I classify AI as one of two types:

  1. Expert systems
  2. Self-training models

An expert system is one where you as a human tell a program exactly how to do every step of a process. The program does not learn anything on its own, but the result can seem to be very intelligent and be completed faster and more accurately than any human can.

A self-training model is one where you give a general AI program lots of different problems to be solved along with the answers to each problem. You let the program itself work out how best to generalize the problem and produce a solution for it.

AI can be expert systems, self-training models, or a combination of the two.
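
To make the distinction concrete, here is a toy sketch in Python. It is not how any real handwriting or chess program works. The task (deciding whether a handwritten letter is a "tall" one from its height), the 5.0 mm cutoff and both function names are invented for illustration.

```python
def expert_is_tall(height_mm):
    # Expert system: a human wrote this rule and chose the cutoff by hand.
    return height_mm > 5.0


def train_threshold(examples):
    """Self-training in miniature: pick the cutoff that makes the fewest
    mistakes on labelled (height_mm, is_tall) examples.

    Needs at least two examples with different heights.
    """
    heights = sorted(h for h, _ in examples)
    # Candidate cutoffs sit midway between neighbouring observed heights.
    candidates = [(a + b) / 2 for a, b in zip(heights, heights[1:])]

    def errors(cut):
        return sum((h > cut) != label for h, label in examples)

    return min(candidates, key=errors)


# Hypothetical training data: problems together with their answers.
examples = [(2.0, False), (2.5, False), (3.0, False), (6.0, True), (7.0, True)]
learned_cutoff = train_threshold(examples)
```

In the first function the knowledge lives in the rule the programmer typed. In the second, the programmer only supplies examples and a way to count mistakes, and the cutoff comes out of the data.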

An example is a chess program. The first programs were all expert systems. All the rules were written by the programmer. In 1997, Deep Blue became the first chess program to beat the reigning world champion, who was Kasparov at the time. This program was an expert system, but with hardware that made calculations very fast.

However, self-training systems can do better. In 2017, a chess program called AlphaZero was developed that was trained solely via self-play for just 9 hours. It then defeated Stockfish, which was at the time the strongest chess program, and it won with an amazing score of 28 wins, 72 draws, and zero losses.


What is Involved in Reading Handwriting?

The goal with handwriting recognition is simply to transcribe the handwriting into text. No translation is required. We are just looking to have each handwritten letter, number or symbol converted to the correct text. And with as few mistakes as possible.

There are already many good translation tools available (e.g. Google Translate), so if the handwriting is in a foreign language, an accurate transcription should then translate correctly. The program that reads the handwriting and creates the transcript need not translate it or understand what the words mean, but it will do better if it knows the language well enough to tell that this “i” must be an “e” since there is no such word otherwise.
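
The “this i must be an e” idea can be sketched with Python's standard `difflib` module. This is only an illustration of dictionary-based correction, not a description of what Transkribus does internally. The tiny `VOCABULARY` list, the `correct_word` name and the 0.8 cutoff are all assumptions.

```python
import difflib

# Hypothetical tiny vocabulary; a real one would hold many thousands of words.
VOCABULARY = ["homestead", "inspector", "letter", "spirit", "law", "the", "of"]


def correct_word(word, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the word if it is known, else its closest vocabulary match,
    else the word unchanged."""
    lower = word.lower()
    if lower in vocabulary:
        return word
    # get_close_matches ranks candidates by similarity and drops any below cutoff.
    matches = difflib.get_close_matches(lower, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word
```

With this list, `correct_word("lettir")` comes back as `letter`, while a word with no near neighbour in the vocabulary, such as a surname, is passed through untouched. That is also the weakness: a dictionary cannot help with names that are not in it.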

Obviously, this is not a job for an expert system. Nobody can effectively describe the rules they use in their head to read handwriting. So we must use a self-training model.

Generally, if you have 100 pages of handwritten English text all written by one person and the typewritten equivalents, a good self-training AI should be able to train itself to read that particular person’s handwriting.

And if you get 20 different people to write the same text, then the AI should be able to do a good job of generalizing its model to read not just those 20 different people’s writing, but almost anybody’s, except your doctor’s. (For your doctor, you’ll still need to get your pharmacist to read it.)


Trying an English Document

Well, let’s see how well Transkribus does. I took part of a letter from my great-grandfather’s homestead application in 1906. (Click on image to get full sized):

image

I selected “English Handwritten” and used its default AI model “The English Eagle” and in only about 30 seconds, it gave me this:

image

The red underlines mark the spellings that Microsoft’s spell checker flagged as incorrect. I’d say Transkribus did an excellent job, and even for the red underlined words, you’d have to say Transkribus usually did produce what appears to be written on the page.

It had trouble with the edits on the page, and interpreted the “it is not” inserted at an angle at the left as “Goedener”. It missed the inserted “the” in “spirit of the law” in the line before. And the most important word (my great-grandfather’s surname “Focshaner”) was inserted with a caret in “as if he ^ were in any way”, but the surname was missed and the “he were” became “herwere”.

So that’s how a page of handwritten English text can get transcribed. It did a good job on a good quality document with relatively neat handwriting. You could do as good a job yourself if you are able to read handwriting, and you could then use Transkribus to help you decide on the words that are more difficult. I am somewhat impressed.


Trying an English Genealogy Document

But we’re genealogists. Our documents to interpret are not as simple as a well-written page of text. Our documents are mostly forms and we need help getting names, places, dates and notes from them.

Let’s try this Homestead Inspector’s Report, also for my great-grandfather. This is more typical of one of the “good quality” documents a genealogist deals with:

image

The option selected again was “English Handwritten”. Supposedly only the handwriting was to be interpreted. But it gave me this:

image

I’ll let you compare for yourself, but I was quite disappointed with these results. They are just a bit too far away from correct to be useful.

Transkribus may have other English models that might do a better job, or you can train one yourself. I think this result reflects my current impression of how much further AI has to go with regards to reading handwriting. But it’s a start.

I don’t need a program to read English handwriting for me. For the few documents I have, I am able to do it quite well myself because I understand English and know how to handwrite in English and read English handwriting.


Any Chance At All for Russian?

All 9 of my and my wife’s grandparents (the extra is my father’s stepfather) came to Canada in the early 1900’s, two from Romania, and seven from what was the Russian Empire and now is Ukraine. All of their birth documents and their ancestors and family documents are written in Romanian or Russian.

Let me concentrate on the Russian documents. These are all from 1910 or earlier and mostly include Birth, Marriage, Death and Revision List (i.e. Census) records and all the text is handwritten onto forms. Just over 2 years ago, I took a wonderful Salt Lake Institute of Genealogy (SLIG) Course on Researching Russian Genealogy Records, which made me do the valuable task of learning the Russian alphabet as a prerequisite.

Theoretically even though the alphabet is Cyrillic rather than Latin characters, an AI program trained on these documents should do just as well converting handwriting to text whether in Russian or in English. The quality of the handwriting would be the biggest consideration in any language.

Melanie McComb pointed me to an article that lists 3 public AI models for Russian Handwriting that could be used with Transkribus. The one called “Russian Handwriting Early 20th Century” seems most appropriate for my documents since the Russian alphabet had extra letters and the language was somewhat more complex before the Russian Revolution.

Well, let’s go all in and try it.

Here, for example, from JewishGen is the marriage record of my wife’s great-grandparents Moshko Furman and Charna Rushaylo in Zhitomir in 1886.

image

To be honest, I don’t give the AI much hope.

Even so, I go over to the Russian Handwriting early 20th century model page, and I upload my document.

Well it did give me something, actually more than I expected. And when I throw this into Google Translate, I get:

image

Unfortunately, there isn’t much in the translation that’s recognizable.

The names of the bride and groom that are at the left of the record weren’t even interpreted, probably because they were heavily underlined in ink, something done a lot in Russian records. Those may have obscured the names from Transkribus.

Also, old Russian handwriting tended to split words in two at the end of a line without a hyphen or any indication that the word is split. That really does a number on Google Translate’s results.
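
One conceivable way to soften this problem before translating is to glue a line-final fragment to the first fragment of the next line whenever the two together form a known word. The Python sketch below only illustrates that idea. The `rejoin_split_words` name and the one-word vocabulary are made up, it ignores punctuation, and it will wrongly merge two separate words whenever their concatenation also happens to be in the vocabulary.

```python
def rejoin_split_words(lines, vocabulary):
    """Merge a word broken across a line end when the two fragments
    together form a word found in `vocabulary` (a set of lowercase words)."""
    words = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if words and (words[-1] + tokens[0]).lower() in vocabulary:
            # Last fragment of the previous line + first fragment of this line
            # make a known word, so treat them as one word.
            words[-1] += tokens[0]
            tokens = tokens[1:]
        words.extend(tokens)
    return " ".join(words)


# Hypothetical transcription where the town name was split at the line end.
transcribed = ["Жених из Лип", "кан Хотинского уезда"]
joined = rejoin_split_words(transcribed, {"липкан"})
```

Here `joined` holds the text as a single line with the town name whole again, which gives a translation tool a much better chance.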

Here’s how JewishGen indexes the record:

image

If I take the text of the comments:

Groom - townsman from Lipkany, Khotinskij uezd; 1st marriage. Bride - townswoman from Chudnov, Zhitomirskij uezd; maiden (1st marriage).  

and I use Google Translate to convert it to Russian:

Жених – горожанин из Липкан Хотинского уезда; 1-й брак. Невеста – горожанка из Чуднова Житомирского уезда; девица (1-й брак).

And then I change the Russian type font to a Russian handwriting font:

image

And then I throw that text back into the Russian model of Transkribus, I get … unfortunately this:

image

I’m very surprised. That’s just about the best-written Russian handwriting you’ll ever find. I tried the other Russian models on it as well, and no-go.


Conclusion

It’s going to be a while yet before any AI tools will be able to interpret handwritten genealogy documents for us, especially those from before the 20th century in the Cyrillic alphabet.

For now, we’ll have to continue to rely on our foreign-language researchers who have spent years reading those documents, and can use their experience to understand them and to even find them for us in the first place.

Eventually, an AI model might be able to be trained for a particular type of document, such as the Russian marriage document I tried above. But it will take someone with the expertise, time and patience to do it.


Followup March 3:

I had two suggestions on Twitter with regards to this article:

  1. Try Ocelus by Teklia.
    It can accept Russian handwriting and output the corresponding Russian letters. But for my test documents, when the output is copied to Google translate, not enough words are correct to be of use.
  2. Try Yandex by Iron Hive, a Serbian company. This is a set of tools designed to help with Russian documents. There is a Yandex Vision OCR tool that includes support for Russian and English handwriting recognition. But this seems to be a paid service for programmers and I don’t see a simple way to try it with an uploaded document.

Continuing Education 2023 - Thu, 28 Dec 2023

Last January, the Association of Professional Genealogists  @APGgenealogy started requiring that members report at least 12 hours of Continuing Education each year. I found the task of listing my CE time for 2022 quite interesting and last January I posted what I had done.

Below is my Continuing Education activity list for 2023. Each event was 1 hour unless otherwise noted.

Webinars – Total 25.5 hours

  • Jan 4 – The 5 steps to organizing your DNA in 2023 – Diahan Southard
  • Jan 19 – The Basics of Jewish American Genealogy, Rhonda McClure
  • Feb 11 – 10 Tips of Successful Online/Onsite Research in Ukraine, Russia and Belarus – Alina Khuda, Virtual Genealogical Association
  • Mar 14 – FamilySearch GEDCOM Technical Q&A – Gordon Clarke
  • Mar 14 – RootsTech Recap – Daniel Horowitz, MyHeritage
  • Mar 28 – New Developments of MyHeritage DNA by Gal Zrihen, MyHeritage
  • Mar 29 – Predicting Unknown Close DNA Relationships Just Got Better! Segcm Tool – Andy Lee, Family History Fanatics
  • Mar 29 – The Alex Krakovsky Project – Navigating the Wiki to Locate Town Records, JewishGen
  • Apr 11 – First Steps First: RootsTech Recap – Daniel Horowitz
  • Apr 21 – DNA Roundtable: Relationship Predictors – Leah Larkin (90 min)
  • May 25 – Test. Analyze. Repeat: Long-term DNA Strategies for Success – Diahan Southard.
  • Jun 16 – Finding Your Ancestors in Canadian Land Records – Tara Shymanski, Legacy Family Tree Webinars
  • Jul 14 – Celebrating 2,000 Webinars! plus 10 tips you can use today – Geoff Rasmussen, Legacy Family Tree Webinars
  • Aug 8 – Ten MORE Secrets to Using MyHeritage – Daniel Horowitz, Legacy Family Tree Webinars
  • Aug 23 – DNA Painter Basics: Strategies to Enhance Your Genealogical Research – Adina Newman, Virtual Genealogical Association
  • Oct 2 – Ask the Experts: Katy Rowe-Schurwanz from FamilyTreeDNA – Diahan Southard (30 min)
  • Oct 9 – Ask the Experts: Blaine Bettinger – Diahan Southard (30 min)
  • Oct 23 – Ask the Experts: DNA Painter – Diahan Southard (30 min)
  • Nov 14 – New Updates on Your MyHeritage Family Tree – Uri Gonen
  • Nov 16 – Ask the Wife: A Powerful DNA Strategy – Diahan Southard
  • Nov 20 – Ask the Experts: Michelle Leonard – Diahan Southard (30 min)
  • Nov 28 – The Good News About Historical Newspapers – Daniel Horowitz
  • Nov 30 – Organize Your DNA Matches – Kelli Bergheimer
  • Dec 9 – Ten Awesome Things You Can Do on WikiTree – Connie Davis, Virtual Genealogical Association
  • Dec 12 – The Latest Developments in Searching Historical Records on MyHeritage – Maya Geier, MyHeritage
  • Dec 15 – Landscape of Dreams: Jewish Genealogy in Canada – Kaye Prince-Hollenberg, Legacy Family Tree Webinars
  • Dec 20 – Got Old Negatives? Scan Them With Your Phone and These 5 (Mostly) Free Apps! – Elizabeth Swanay O’Neal, Legacy Family Tree Webinars

Conferences (Online) – Total 14 hours

  1. Mar 2 to 4 – RootsTech 2023
    • Getting Started in Jewish Genealogy – Ellen Kowitt
    • What’s New at FamilySearch in 2023 – Craig Miller
    • Using DNA to Determine Relationships in 2023 – Beth Taylor
    • How third-party DNA tools can help with your family history research – Jonny Perl
    • Different Ways to Work with Your Family Trees – Uri Gonen
    • Tracing Your Jewish Roots in Ukraine – Ellie Vance (30 min)
    • Using Maps and Gazetteers to Locate the Hometown – Ellie Vance (30 min)
    • What’s New in RootsMagic 9 – RootsMagic
  2. Nov 2 to 5 – WikiTree Symposium and WikiTree Day
    • Mastering the Updated Library and Archives Canada Website – Kathryn Lake Hogan
    • DNA Consultations at AmericanAncestors.org – Melanie McComb
    • DNA Group Projects and WikiTree – Mags Gaulden
    • Tech Troubleshooting – What Would You Do? – Thomas MacEntee
    • Keep Your Family’s History Safe for the Future – Marian Burk Wood
    • Reverse Phasing – What and Why? – Kevin Borland
    • Artificial Intelligence (AI) & Genealogy Panel Discussion, Drew Smith, Dana Leeds, Steve Little, Thomas MacEntee, Rob Warthen, Willie

In total, my time for 2023 was 39.5 hours, which is very similar to my 2022 total of 38 hours.


Planning for 2024

Now is a good time to plan in advance your 2024 activities. I like to add them to my calendar as soon as I find something of interest to me that might contain new or updated information.

I plan again to attend RootsTech online from Feb 29 to Mar 2.

image

You can go to their Search the On-Demand Library page and look through their catalog of 4,423 results for over 1,500 sessions from 2019 to 2023 that are still available online. I’m sure you’ll find something of interest there.

They have more than 200 new online sessions planned for 2024. The new sessions are not yet listed on their site, but they will be soon. When they are ready, you’ll be able to filter your search by year, and 2024 will be an option. Then you can plan the sessions that you’ll want to watch.

Another planning activity to do right now is to check out which Legacy Family Tree Webinars you’ll want to see in 2024. They just came out with their planned classes and they will feature 112 speakers who will be giving 168 talks. That’s almost one every second day. You can find their list and register for the sessions you want here: Upcoming Webinars - Legacy Family Tree Webinars

image

I found 17 sessions that I’m already interested in that I’ve now registered for.

Most of the Legacy Family Tree Webinars are free to watch live. Usually, if you miss the live session, you can still watch it for free for about a week.

Of course, another way to get some Continuing Education is to attend a genealogy conference. I have not attended an in-person conference since before Covid. I was planning to finally go on a Genealogy European River Cruise in October 2024 which was to have featured Judy Russell and Blaine Bettinger as the speakers. I was really looking forward to this, but unfortunately it had to be cancelled. It seems I’ll have to wait a while longer until I find another in-person genealogy conference of interest to me.

Now it’s up to you to get to it. There’s no time like the present to plan some of your genealogical activities for 2024.