Louis Kessler’s Behold Blog

Conflicting Information in GEDCOM - Tue, 9 Nov 2021

An issue about GEDCOM has once again come to my attention.

In the GEDCOM 5.5.1 standard they write:

Conflicting event dates and places should be represented by placing them in separate event structures with appropriate source citations rather than by placing them under the same enclosing event.

I addressed this over 8 years ago in my article: Multiple Events and Unions in GEDCOM where I said this:

What this means is that if you have two conflicting sets of information for an event, such as a birth event, then there should be separate event structures for them, e.g.:

1 BIRT
2 DATE 1880
1 BIRT
2 DATE 1870

Presumably you’d have more information with each including the full dates, the places, your sources and notes about each bit of evidence. Because of the GEDCOM rule, the first of the two would be considered the preferred, i.e. most credible date.

This is all fine and good for events like Birth and Death that, other than in extremely unusual circumstances (e.g. brought back from a coma, or science fiction), normally occur only once in any person’s life.

The trouble is that almost any other event can occur multiple times in a person’s life: adoption, naturalization, census, education, retirement. There have been people who have had multiple baptisms and even multiple burials.

This results in a problem. For events other than Birth and Death, if the events are represented like the 4-line GEDCOM example above, how do you tell if they are two different events of the same type, or if they are two sets of conflicting information about the same event?

The answer is, you can’t. GEDCOM does not explain how to distinguish the difference.
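To make the ambiguity concrete, here is a minimal Python sketch of what a reading program sees. The function names and the sample data are made up for illustration and are not from any GEDCOM library. It collects level-1 event tags that repeat, and the repeated BIRT (presumably conflicting information) comes out looking exactly like the repeated CENS (presumably two different censuses):

```python
from collections import defaultdict

def parse_lines(text):
    """Split GEDCOM text into (level, tag, value) tuples."""
    records = []
    for raw in text.strip().splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        value = parts[2] if len(parts) > 2 else ""
        records.append((int(parts[0]), parts[1], value))
    return records

def repeated_event_tags(text):
    """Return level-1 tags that occur more than once, with their DATE values."""
    dates = defaultdict(list)
    current = None
    for level, tag, value in parse_lines(text):
        if level == 0:
            current = None
        elif level == 1:
            current = tag
            dates[current].append(None)  # placeholder until a DATE is seen
        elif level == 2 and tag == "DATE" and current is not None:
            dates[current][-1] = value
    return {tag: vals for tag, vals in dates.items() if len(vals) > 1}

# Hypothetical sample: two BIRT structures and two CENS structures.
sample = """
1 BIRT
2 DATE 1880
1 BIRT
2 DATE 1870
1 CENS
2 DATE 1900
1 CENS
2 DATE 1910
"""

print(repeated_event_tags(sample))
```

Nothing in the returned structure says which repeats are conflicts and which are separate events. A program can only guess from the tag name.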


A Standard Needs to be Standardized

You would want a standard like GEDCOM to be followed by all developers. You would hope that the standard is internally consistent in how similar objects are represented.

Here we have an inconsistency, where more than one occurrence of the same event type can represent either:

  1. Two different events, or
  2. Conflicting information for the same event.

This type of inconsistency should never happen in a standard. So what is the possible solution?

Conflicting events are not multiple events. They are different versions of the same event based on different sources that give different information for the event. If we are talking about the same event, then all the information available for the event, conflicting or not, should be included in the event. For example, one idea to change GEDCOM might be like this:

1 BIRT
2 DATE 1880
3 SOUR @S1@
2 PLAC Wilmington, Delaware, USA
3 SOUR @S1@  
2 ALTDATE 1870
3 SOUR @S2@ 
2 ALTPLAC New York City, New York, USA
3 SOUR @S2@

So this adds new tags ALTDATE and ALTPLAC to indicate alternative information for the same event. Note that the source for that information is indicated.

Personally, I don’t like the idea of GEDCOM adding a million new tags, and adding each item of information individually unnecessarily causes repetition of source information. So this would not be an easy thing for developers to manage.

Maybe what would be better then, would be to include groups of alternative information with each group denoted by a single new tag, maybe ALT, e.g.:

1 BIRT
2 DATE 1880
2 PLAC Wilmington, Delaware, USA
2 SOUR @S1@ 
2 ALT
3 DATE 1870
3 PLAC New York City, New York, USA
3 SOUR @S2@ 
2 ALT
3 DATE 1875
3 SOUR @S3@

And then conflicting information for non-unique events, which don’t currently have a mechanism, can be done the same way.
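As a rough sketch of how a program might read such a structure, the Python below gathers the preferred version of an event plus each ALT group. Keep in mind that ALT is only the tag proposed in this post and is not part of any GEDCOM standard, and the function name is invented for the example:

```python
def read_event_versions(lines):
    """Collect the preferred version and any ALT versions of one event.

    `lines` are the GEDCOM lines of a single level-1 event, in order.
    Returns a list of dicts; the first dict is the preferred version.
    """
    versions = [{}]   # versions[0] holds the facts given directly at level 2
    alt = None        # the ALT group currently being filled, if any
    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if level == 2:
            if tag == "ALT":          # hypothetical tag from this post
                alt = {}
                versions.append(alt)
            else:
                alt = None
                versions[0][tag] = value
        elif level == 3 and alt is not None:
            alt[tag] = value
    return versions

birth = [
    "1 BIRT",
    "2 DATE 1880",
    "2 PLAC Wilmington, Delaware, USA",
    "2 SOUR @S1@",
    "2 ALT",
    "3 DATE 1870",
    "3 PLAC New York City, New York, USA",
    "3 SOUR @S2@",
    "2 ALT",
    "3 DATE 1875",
    "3 SOUR @S3@",
]

for version in read_event_versions(birth):
    print(version)
```

Each version keeps its own source, so the source does not have to be repeated for every individual item, and the same routine would work unchanged for a CENS or RESI event.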

I should mention that FamilySearch GEDCOM 7.0 does not address the issue of conflicting information and has the same problem as GEDCOM 5.5.1.


NO Is Wrong

While I’m at it, I should mention one of the few changes FamilySearch GEDCOM 7.0 introduced is sort of related to the conflicting information issue.

FSG 7.0 introduced a NON_EVENT_STRUCTURE indicated by the tag “NO”, about which they say:

Indicates that a specific type of event … did not happen within a given date period (or never happened if there is no DATE substructure).

with this example:

1 NO MARR
2 DATE TO 24 MAR 1880

Well gee thanks! That will break just about every genealogy developer’s code, will need to be handled for every possible event tag, and may require changes to the program’s database as well.

In this particular example, I don’t know why the date can’t be specified as:

1 MARR
2 DATE AFT 24 MAR 1880

And if you wanted to indicate that the couple didn’t marry, why not follow the model that GEDCOM 5.5.1 already had and allow it to be specified as:

1 MARR N

After all, GEDCOM already allows the following to indicate that a marriage happened but without additional information:

1 MARR Y
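A small Python sketch of the difference for a reading program follows. The function name and return values are invented for illustration, and the “N” branch is only the form suggested in this post, not something in any GEDCOM standard:

```python
def classify_event_line(line):
    """Classify one level-1 GEDCOM line as (event tag, status)."""
    parts = line.strip().split(" ", 2)
    tag = parts[1]
    value = parts[2] if len(parts) > 2 else ""
    if tag == "NO":
        # FamilySearch GEDCOM 7.0 style: the event type is the line value,
        # so the reader needs a separate branch before normal event handling.
        return (value, "did not happen")
    if value == "N":
        # Hypothetical: the "MARR N" form suggested in this post.
        return (tag, "did not happen")
    # In this sketch, "MARR Y" and a bare "MARR" both mean the event happened.
    return (tag, "happened")

for line in ["1 NO MARR", "1 MARR N", "1 MARR Y", "1 MARR"]:
    print(line, "->", classify_event_line(line))
```

With the “N” form, the event tag stays where existing code already expects it. With “NO”, the tag moves into the value, and that is the part that touches every event handler.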


Two People Married More Than Once

The issue of conflicting information was brought back to my attention a few days ago by a discussion in a GEDCOMGeneral Google group. The question raised was if two people married and then separated and then married a second time, should they be included as one FAM (Family) record, or two? 

e.g. as one FAM

0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1950
1 DIV
2 DATE 1960
1 MARR
2 DATE 1970

or as two FAMs:

0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1950
1 DIV
2 DATE 1960

0 @F2@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1970

In the first case, children from both marriages will be together. In the second one, they would be split into the two families, even though they are full siblings.

Well the issue gets more complicated. What do you do then with a child born between the divorce and the 2nd marriage?

This really should not be an issue at all. It is clear that there should be only one FAM representing all the relationships of two people and all the children they have. Multiple MARR tags should be allowed under a single FAM tag.

But, when they are, are they considered to be two marriage events, or conflicting information for one marriage event?  GEDCOM is ambiguous.
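For anyone who wants to check how their own file handles this, here is a minimal Python sketch that lists couples who appear in more than one FAM record. The function name is made up and the parsing is deliberately simplistic (it only looks at HUSB and WIFE lines):

```python
from collections import defaultdict

def couples_with_multiple_fams(text):
    """Return {(husband id, wife id): [FAM ids]} for couples in 2+ FAM records."""
    fams = {}        # FAM id -> {"HUSB": id, "WIFE": id}
    current = None
    for raw in text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue  # skip blank lines between records
        level = int(parts[0])
        if level == 0:
            # A family record header looks like: 0 @F1@ FAM
            if len(parts) == 3 and parts[2] == "FAM":
                current = fams.setdefault(parts[1], {})
            else:
                current = None
        elif (level == 1 and current is not None
              and len(parts) == 3 and parts[1] in ("HUSB", "WIFE")):
            current[parts[1]] = parts[2]
    couples = defaultdict(list)
    for fam_id, spouses in fams.items():
        couples[(spouses.get("HUSB"), spouses.get("WIFE"))].append(fam_id)
    return {pair: ids for pair, ids in couples.items() if len(ids) > 1}

two_fams = """
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1950
1 DIV
2 DATE 1960

0 @F2@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1970
"""

print(couples_with_multiple_fams(two_fams))
```

Finding the duplicates is the easy part. Whether the MARR events can then be safely merged under one FAM is exactly the question GEDCOM leaves unanswered.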


Remembering Past Articles

It’s not good enough for the people writing the standards to just think about an issue and imagine what might be best. Each issue should be studied in detail, and when it comes to GEDCOM, there has already been a lot of study and discussion of most issues. Years of BetterGEDCOM, FHISO, and independent thinking by many genealogy developers should not just be re-thought without referring back to the work that has already been done.

So let me refer you back to:

  1. My article from 2013:  Multiple Events and Unions in GEDCOM
  2. Tamura Jones’ article from 2019:  Married, Divorced, Married Again

In case you don’t want to bother reading the two articles, both conclude that because of the way GEDCOM was written, and because of the way developers implemented GEDCOM, a FAM needs to represent just one union. The MARR and DIV (or ANUL) events therefore represent the start and the end of the union, just like the BIRT and DEAT events represent the start and end of an INDI which represents one life. Multiple MARR and DIV events within one union represent conflicting information, just as multiple BIRT and DEAT events represent conflicting information within an INDI.

All other tags, e.g. CENS, RESI, OCCU, EDUC, etc., represent events that can occur multiple times, so there’s no way to represent conflicting information for those events.  This is an inconsistency in GEDCOM that should be fixed.

Until this change to conflicting information is made, the FAM must remain as one union from MARR to end of MARR.  But once multiple events no longer are used to represent conflicting information, then the FAM concept can be changed to represent the more logical concept of all the relationships between two people and the children they have together.

So the GEDCOM-standards writers really need to change how conflicting information is handled so that the FAM concept can be repaired.

My Computer History - Sun, 7 Nov 2021

Prompted by this week’s Saturday Night Genealogy Fun post by Randy Seaver, I thought I’d document my computer history in a blog post.

1971: As I entered high school (grade 10), my super-smart neighbor and friend who was two grades ahead of me recommended I follow his lead and get into programming at school. The high schools in Winnipeg had a Control Data Corporation (CDC) mainframe and our school had a card reader and printer that connected to it.  We learned FORTRAN and I had fun with my best friend Carl writing various programs. See: 25 Years of Delphi

1974: My friend Carl and I both wrote computer programs to play chess. In Grade 12, we had our programs play each other. This was covered in both of our city’s newspapers. Carl called it a contest between brute force and finesse. See: The Beginnings of a Chess-playing Program and BRUTE FORCE vs FINESSE.

1974: I took Statistics at the University of Manitoba and mixed a few Computer Science courses in as well. Hundreds of students would stand in line to use the keypunch machines (the older KP-26 and the newer KP-29 models) and then stand in line at the card reader and hand their deck of cards to the person whose job was to feed the cards into the card reader. We’d then walk past another person who was separating the fan fold paper coming out of the printer and then placing each of our outputs on the pickup table. If our coding had an error, it required standing in line at the keypunches, retyping the cards that needed fixing and repeating the process.

1975-1977: Fortunately, dumb terminals were becoming available at the University. These were Cathode Ray Tubes (CRTs) that simply acted as an interface to the University’s mainframe. That meant our programs were stored on the mainframe, so no more computer cards!

My first genealogy program was a Script Document Processor utility on my University’s mainframe. It used markup similar to HTML to specify how to make everything look, and included features to create a table of contents and an index of names and an index of places.

In what remaining spare time I had at University, I also continued to work on my chess program.

I worked as a summer student for 3 years at Manitoba Hydro, our electrical utility in the province. They liked my FORTRAN knowledge and my math/stats skills and I got to work on cleaning up the code of some of their mainframe programs to help design Hydro Towers and place the Towers optimally along their route.

1977-1978: My chess program Brute Force was accepted into the 8th and 9th North American Computer Chess Championships. The 8th took place in Seattle, Washington, and the 9th took place in Washington, D.C. We would use modems to relay the opponent’s move to our home computer and wait for our program’s response, which we would then physically make on the board for it. See: Computer Chess - A Memorial to Brute Force

1978-1980:  I completed my Masters Degree in Computer Science at the University of Manitoba.

1980-1988: I was hired full-time at Manitoba Hydro after I graduated and worked my first 8 years as a programmer and systems analyst on various engineering projects and models. Our company had its own mainframe, and we developed engineering systems in FORTRAN, one in PL/I, and one in Pascal on Apollo Computers, which were UNIX-based minicomputers that were awesome!

1988: At Manitoba Hydro, I accepted a position in the Load Forecasting Department. This was my real introduction to PCs. The company had been using 286 computers up to that time. One of my first tasks was to justify to our Division Manager the purchase of what would be the most powerful computer in the company: a Compaq 386 20 MHz computer for $10,000, a 300 MB hard drive for it for $10,000 more, and the Operating System and Software for $5,000 more. We got the computer and I started developing our Department’s Customer Information Database on it. We used a database called PC-FOCUS developed by Information Builders, which was a fantastic program.

1990: Using PCs at work for the past few years made me want one at home. It wasn’t until about 1990 that prices came down to something reasonable and I purchased an IBM PC 286 no-name clone for about $2,500. I think it was a 12 MHz computer with 8 MB of RAM and a 20 MB hard drive.

1992-1993: Hard drive capacity was growing fast. I upgraded in 1992 to a 60 MB hard drive and in 1993 to a 260 MB hard drive.

1992-1995: I tried various genealogy programs. The one I liked best was Reunion for Windows. I used it until 1997 when Leister sold it to Sierra, who were developing it to be released under the name of Generations. I became a Beta tester for Generations. Sadly, Generations was purchased by Genealogy.com, who simply dropped it, supporting their own Family Tree Maker program instead. My last entry of my genealogy data into Generations was in 1999. I never updated my genealogy data again until 2018 when I started using MyHeritage and Family Tree Builder. See: So How’s My Genealogy Going

1995: Upgraded my system board finally to a 386 and 8 MB of RAM.

1997: I had to upgrade my computer by buying 16 MB more RAM for $99  to get to 24 MB RAM and replace my 260 MB hard drive with a 2 GB hard drive for $360 so that I could upgrade from Windows 3.1 to Windows 95.  See: Computers 23 years ago

1999: Purchased a new computer with an Intel Pentium III at 600 MHz running Windows 98.

2006: Purchased an HP Media Center PC, 3 GHz, 1 GB RAM. See: Wednesday, January 11, 2006. Two days later, my old Windows 98 computer died. See: Saturday, February 4, 2006

2007: Upgraded my computer to Windows Vista. See: Sunday, June 3, 2007. Surprisingly, I never had the troubles others had with Vista. Worked fine for me.

2009: Purchased a PC with an AMD Phenom 9650 Quad-Core CPU and 7 GB RAM running 64-bit Windows Vista.

2010: This was the tech I had at the time: What I Do

2014:  My current computer was now five years old. See: When Is It Time To Get A New Desktop Computer. So I purchased an HP Envy 700-209 with an Intel i7-4770 Quad-Core with 12 GB RAM and a 2 TB hard drive running 64-bit Windows 8.1. It was 3 times faster. I bought and installed a 240 GB SSD (Solid State Drive). See: Setting up a Solid State Drive with Windows 8.1 – I also bought two identical HP Pavilion 23tm (23 inch) monitors which I love and have been using ever since and I hope they never die.

2019:  Never tried Python before, so I had a bit of fun with this:  50 Years, Travelling Salesman, Python, 6 Hours

2020:  My HP Envy died. See: When Everything Fails At Once. I replaced it with a HP Z420 Xeon Workstation with 32 GB RAM, 512 GB SSD and a 2 TB hard drive for $990 with 64-bit Windows 10 installed on the SSD drive.

Today: I’m very happy with my current Xeon computer. However, I’m very disappointed that it does not meet the minimum system requirements for Windows 11. The CPU is not supported and it only has TPM 1.2 and not 2.0. I’ll likely wait to see if Microsoft loosens the requirements a bit to allow my machine to upgrade. If not, I’ll probably wait until the end of life of Windows 10 in 2025 and buy a new computer that already has Windows 11 installed.

Also: Today, Nov 7, 2021 is my 19th blogiversary. My first blog post was 19 years ago on Nov 7, 2002. And this post is my 1200th post!

Those of you who see me on Zoom will see this background behind me. When I’m on Zoom, I’m actually sitting at this desk with a blank wall behind me. My HP Z420 desktop is at the back left of the desk and you can see my two HP Pavilion 23 inch monitors. In front of my desktop is my Epson DS-860 scanner. Behind it is my Epson WF-4740 printer. Above my desktop on the wall is my Boomer and her Friends calendar.

Taking a Salt Lake Institute of Genealogy (SLIG) Course - Wed, 20 Oct 2021

Today, I completed week 6 of my 10 week SLIG course.

Every Wednesday afternoon between Sept 15 and Nov 17, I’m being presented with two 75 minute lessons on Researching Russian Genealogy Records. There’s a 30 minute break between the lessons. There’s 15 minutes after the 2nd lesson where we’re given our homework assignment. And we review the previous week’s homework 15 minutes before the first lesson.

This is a virtual course, one of the Salt Lake Institute of Genealogy’s Fall Virtual 2021 offerings.


Why Am I Taking This?

It’s been a long time since I last took a course in anything, and I never expected that I’d ever be taking a genealogy course. I’ve been doing my genealogy for over 45 years, learning in the beginning via books and genealogy clubs and finding ways to use a computer (pre-internet) to record my family information in a useful form.

I gained a lot of experience over the years and have written articles and given lectures and workshops on various aspects of both standard and genetic genealogy, but I had never taken a genealogy course before. So why now?

Two of my grandparents and all four of my wife’s grandparents are from what is now the Ukraine and what was at the time the Russian Empire. All of my research on all these lines only went back to the immigrant generation and not much more.

About 2 years ago, I was contacted by Russian researcher Boris Makalsky. He saw my JewishGen Family Finder entries. He emailed me and said he found records for my wife’s Furman family in Zhitomir and asked me if I was interested in acquiring them. For a reasonable price, Boris sent me images of the original documents in Russian along with an English translation. Since then, Boris has found me over 100 documents extending one of my grandparents’ and two of my wife’s grandparents’ lines back to the early 1800’s, dispelling several family myths and adding scores of new relatives living in different towns for me to research.

Today, there are a lot of Russian records available in Russian archives. Most archives require a researcher to visit in person to do research, and you pretty well need to know Russian for that. Only a small number of those records have been photographed. A fraction of those have been made available online, and very few have been translated and indexed.

But new initiatives such as those by Miriam Weiner and Alex Krakovsky have recently been bringing many more images of records online. I’ve been inspired by the work of Lara Diamond, who writes in detail on her Lara’s Jewnealogy blog how over the past few years she has acquired Russian records and greatly extended her family research. Lara does not speak Russian, but she taught herself how to read it and became an expert in doing so.

In August, I saw Judy Russell’s blog post: SLIG 2021 registration opens Saturday. I know Judy often teaches advanced courses at SLIG, IGHR, GRIP, MAAGI, Gen-Fed and elsewhere and was curious what SLIG was offering. So I followed her link to the SLIG site and there I saw this:

image

It looked like just what I needed. The clincher was that Lara Diamond was one of the instructors and the specific inclusion of Jewish research.


In-Person versus Virtual

Up to a few years ago, to take a SLIG course, you had to attend in-person in Salt Lake City for a week. The course was Monday to Friday with two morning classes, two afternoon classes, and homework each evening.

I’d have to think that’s pretty intense. Also expensive, both in terms of cost (getting to Salt Lake City and food and accommodations for at least 5 nights) and in time away from the family.

I’m not the best at learning or memorizing things. I have to practise to get familiar with anything, and a few hours each evening for homework or studying isn’t enough for my brain to absorb anything.

The pandemic has also been putting a damper on these in-person events.

So virtual sounded just right. Two classes each week and a week in-between was just perfect to get comfortable with all the material.

What about the social aspect?  One nice thing about an in-person course is interacting with the instructors and socializing with the other students. Don’t we lose that?

Well not really. Over the past year and a half, the world has got quite used to using Zoom. And Zoom is the platform for the SLIG virtual courses. The instructors and most of the students have their cameras on, and we see each other and can ask questions live. We can also type into the chat area to mention something to someone or everyone without disturbing the instructor.

Then a private Facebook group was set up for the students and instructors of our course allowing us to communicate in-between classes.

In addition, a study group was set up, where those of us who wanted got together in a Zoom session on Thursday or Sunday to “study”, which more accurately could be described as “compare homework answers”.

Rather than making and spending time with some new friends for just a week in person, the virtual session is allowing our group to be together for a full 10 weeks.

There’s a lot to like about a virtual course.


The Course Itself

Joe Everett is leading the course. He has proven to be an excellent instructor and has done a really good job setting up and administering it.

These courses are made for the experienced genealogist. The assumption is we want to become able to research Russian documents ourselves. And the course is set up to help us do that.

The only pre-course requirement was that we become familiar with and memorize the Russian alphabet. To make this easier, prior to the start of the course, Joe mailed each of us a set of business-card sized flash cards that he designed to help us work on learning what each letter looked like, sounded like, and what it transliterated to.

image

I spent a few weeks working on learning the alphabet. What worked well for me was singing the Russian alphabet to the tune of the ABC song, which is the same tune as Twinkle, Twinkle, Little Star. The lines I made for the song are:

Ah beh veh geh deh yeh yo,
zheh zeh ee kah el em en oh peh,
ehr ess teh oo ef kha tseh,
cha shah shyah yeri eh you yah,
Now I know my ah beh vehs,
Next time won’t you sing with me.

(It doesn’t include ee kratkoye or the two znahks, but they’re very easy to remember on their own.)

We were also supplied with a huge 292 page Syllabus containing all the course notes. That document alone is worth the price of the course.

During the first six weeks (12 classes), we’ve been learning how to read Russian cursive (i.e. handwriting) because the majority of documents are hand-written. The tricky part is interpreting and adapting to the differences in people’s penmanship. We learned the minimum (thank goodness) that we needed to know about Russian grammar. We learned the basics about Russian history and how the Russian Empire boundaries changed over time. We learned about the various records available (Metrical, Civil Registration, Revision Lists and the 1897 Census, Military and other) and the various methods of finding and obtaining them.

Heather Stewart, Ellie Vance and Lara helped Joe out as instructors for half the classes.

The goal here is not for us to learn Russian. The goal is for us to be able to find Russian records for the surnames and places we are interested in, and to be able to extract the information we want from those records. We are given the knowledge and tools we need to do so, which include Google Translate, transliteration aids and a Russian Genealogical Word List.

There are about 16 people taking the course. Most of them are members of the Association of Professional Genealogists (APG) so I assume many are taking the course to help them to do client work. My needs are personal.

Our last homework was to extract the genealogical information from a number of Russian records. I spent about 20 hours last week working on the assignment and reviewing the methodology in detail. That is exactly what I wanted to learn from this course.


Still To Come

There are 4 more weeks left. Our 2nd lesson today was titled “Putting the Pieces Together, Part I”. I’m still looking forward to: Research in Ukraine; Jewish Genealogy Research; Where are the Records? Accessing Archival Records; Discovering the Place of Origin and Family Context; and Locating the Ancestral Home using Maps and Gazetteers.

We’ll end with “Putting the Pieces Together, Part 2” and a final homework assignment, which I hope will be to find records of my own family.

I’m really enjoying this course, and I’m excited about how it will help my genealogical research.