The Behold User Forum

UTF-8 with BOM does not display right - Categorized in: Report a Problem

3 posts. Started 5 Sep 2013 by klemens. Latest reply 6 Sep 2013 by klemens.
1. klemens (klemens)
Germany
Joined: Thu, 5 Sep 2013
9 blog comments, 9 forum posts
Posted: Thu, 5 Sep 2013

Hi Louis,

I tried opening a GEDCOM file with umlauts in Behold 1.0.5.1. Instead of the umlauts, boxes were displayed. The file was UTF-8 with BOM. When I converted it to UTF-8 without BOM, everything was fine. Unicode (UTF-16) was fine, too.

Can you replicate this?

Klemens

2. Louis Kessler (lkessler)
Canada
Joined: Sun, 9 Mar 2003
288 blog comments, 245 forum posts
Posted: Thu, 5 Sep 2013

Klemens,

Yes, I can replicate the problem. Thank you for pointing it out.

I did some debugging and I think I've found the mistake I made.

It was a bit dumb on my part. This was my checking code:

                else if GedcomCharsetUsed = 'UTF-8' then begin
                  // Bug: the conversion is set up even when a UTF-8 BOM is present
                  CharConvert := ConvertUTF8;
                  if (BOMDisplay = '') then
                    LogIt('', '~MCHAR2')
                  else if (BOMDisplay <> 'EF BB BF' { UTF-8 } ) then
                    LogIt('', '~MCHAR3');
                end

It should have been:

                else if GedcomCharsetUsed = 'UTF-8' then begin
                  // Convert (and log) only when the file has no UTF-8 BOM
                  if (BOMDisplay <> 'EF BB BF' { UTF-8 } ) then begin
                    CharConvert := ConvertUTF8;
                    if (BOMDisplay = '') then
                      LogIt('', '~MCHAR2')
                    else
                      LogIt('', '~MCHAR3');
                  end;
                end

So I was checking if the BOM was the same as the CHAR specified in the GEDCOM and setting up the correct message, but I was converting the string in all cases. When the BOM matches the character set, the string is already correct and shouldn't be converted.
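
To illustrate that decision in isolation, here is a minimal console sketch. It is not Behold's actual code: DetectBOM and NeedsUTF8Conversion are hypothetical names made up for this example, and the sketch simply assumes, as described above, that text read from a file with a UTF-8 BOM is already correct while everything else still needs the UTF-8 conversion.

```pascal
program BomCheck;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: returns the BOM found at the start of Data as
// spaced hex (the same form BOMDisplay has above), or '' if there is none.
// Only UTF-8 and UTF-16 BOMs are recognized; UTF-32 is ignored here.
function DetectBOM(const Data: array of Byte): string;
begin
  Result := '';
  if (Length(Data) >= 3) and (Data[0] = $EF) and (Data[1] = $BB)
     and (Data[2] = $BF) then
    Result := 'EF BB BF'   // UTF-8
  else if (Length(Data) >= 2) and (Data[0] = $FF) and (Data[1] = $FE) then
    Result := 'FF FE'      // UTF-16 little-endian
  else if (Length(Data) >= 2) and (Data[0] = $FE) and (Data[1] = $FF) then
    Result := 'FE FF';     // UTF-16 big-endian
end;

// Hypothetical helper: assumes text from a file with a UTF-8 BOM is
// already correct, so only the remaining cases need converting.
function NeedsUTF8Conversion(const BOMDisplay: string): Boolean;
begin
  Result := BOMDisplay <> 'EF BB BF';
end;

procedure Report(const Name, BOMDisplay: string);
begin
  if NeedsUTF8Conversion(BOMDisplay) then
    WriteLn(Name, ': convert from UTF-8')
  else
    WriteLn(Name, ': already correct, do not convert');
end;

const
  // First bytes of two sample files whose GEDCOM CHAR is UTF-8
  WithBOM: array[0..4] of Byte = ($EF, $BB, $BF, $30, $20);  // BOM + '0 '
  WithoutBOM: array[0..1] of Byte = ($30, $20);              // '0 '
begin
  Report('UTF-8 with BOM', DetectBOM(WithBOM));
  Report('UTF-8 without BOM', DetectBOM(WithoutBOM));
end.
```

With these two sample inputs, the file that starts with EF BB BF is left alone and the one without a BOM is flagged for conversion, which is the behavior of the corrected code.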

I am surprised nobody else reported this. I guess not too many people use UTF-8 GEDCOMs.

Let me know if this is an important fix for you. If so, I can produce a point update. Otherwise, the fix will be in the next full version.

Louis

3. klemens (klemens)
Germany
Joined: Thu, 5 Sep 2013
9 blog comments, 9 forum posts
Posted: Fri, 6 Sep 2013

Louis,

Thanks for checking.
I can export Unicode GEDCOM from my program, so, for me, there is no need to hurry with a new version.

Klemens
