When I cleaned up the memory leaks, I thought I had fixed much of Behold’s memory handling. But then I started finalizing the log file (the last thing before the next beta release) by adding a memory reporting line and I was a little surprised at what still seemed like too much memory use.
I went back again to my 25 MB test GEDCOM with 94,000 people in it. That file now loads in only 8 seconds but still uses 226 MB of memory. That’s 9 times the file size.
So I did a bit of tracing and got a surprising result. Loading the data only uses 138 MB. Since Behold uses Unicode characters, the 25 MB input file immediately gets doubled in size to 50 MB. Add on the data structures and all the derived data, links, pointers and indexes for them and that gives the 138 MB.
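To illustrate the doubling, here is a small Python sketch. It assumes the in-memory strings are UTF-16 (two bytes per character for ordinary Latin text) and uses a made-up GEDCOM line; neither detail comes from Behold's actual code.

```python
# Hypothetical GEDCOM line, used only for illustration.
line = "1 NAME John /Smith/"

# Size on disk if the file is plain single-byte text (assumption).
ansi_size = len(line.encode("ascii"))

# Size in memory if each character is stored as a 2-byte UTF-16 code unit (assumption).
utf16_size = len(line.encode("utf-16-le"))

print(ansi_size, utf16_size, utf16_size / ansi_size)
```

For text made up of plain Latin characters, the ratio is exactly 2, which is how 25 MB on disk becomes 50 MB once loaded.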
Then where are the other 88 MB coming from? I was very surprised to find that it is the TreeView. 35 MB were from the text and link in each of the 160,000 lines of the TreeView. But the other 53 MB were from the data structure of those 160,000 nodes in the TreeView. That means each node was using about 330 bytes of memory. That’s a bit much compared to Virtual TreeView, which says it uses only 60 bytes per node.
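The breakdown can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted above, and it assumes that 1 MB means 1,000,000 bytes.

```python
MB = 10**6  # assumption: decimal megabytes

total_mb = 226          # memory in use after loading the test file
data_mb = 138           # memory used by the loaded data alone
text_and_links_mb = 35  # text and link for each TreeView line
nodes = 160_000         # lines (nodes) in the TreeView

treeview_mb = total_mb - data_mb                    # everything the TreeView adds
node_struct_mb = treeview_mb - text_and_links_mb    # node data structures only

bytes_per_node = node_struct_mb * MB / nodes
ratio_to_virtual_treeview = bytes_per_node / 60     # 60 bytes per node is Virtual TreeView's claim

print(treeview_mb, node_struct_mb, bytes_per_node, round(ratio_to_virtual_treeview, 1))
```

With these inputs the node structures come out at a little over 330 bytes each, roughly five and a half times the 60 bytes Virtual TreeView claims.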
Is this really important? For files up to about 600,000 people, everything can still be stored in under 2 GB, and most computers have that and more. But Behold can’t go much higher, since 32-bit Windows programs can only address 2 GB (or 3 GB with a special trick). And extra memory allocations take time as well.
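As a rough sanity check on that estimate, here is a sketch that scales the test file's numbers linearly. Linear scaling is an assumption (real files vary in how much data each person carries), and the function name is made up for this example.

```python
MB = 10**6                 # assumption: decimal megabytes
ADDRESS_LIMIT = 2 * 1024**3  # 2 GB address space of a 32-bit Windows program

test_people = 94_000
test_memory_bytes = 226 * MB
bytes_per_person = test_memory_bytes / test_people

def estimate_bytes(people):
    """Estimate memory use by scaling the test file linearly (assumption)."""
    return people * bytes_per_person

estimate = estimate_bytes(600_000)
print(round(bytes_per_person), round(estimate / MB), estimate < ADDRESS_LIMIT)
```

At about 2.4 KB per person, 600,000 people works out to roughly 1.4 GB, which fits under the 2 GB limit but leaves little room to spare.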
I still do want Behold to be able to handle super-large files and be a potential winner of the Confucius Cup. So although this isn’t really important now, I’ve got a few future solutions available to reduce the memory-hogging burden that Behold now runs into for large files:
- Use an online database to store the GEDCOM data. This will be done in Version 1.1.
- Replace or optimize the TreeView to use only what it needs to use. One day I’ll do this, but not important now.
- Make a 64-bit version of Behold to eliminate the 3 GB limit. Can’t do this yet. Embarcadero hasn’t finished their 64-bit Delphi compiler. They are working on it (mid-2010?).
Again I ask the question: Is this really important? For most of you, with files that I now consider “small” for Behold (say up to 100,000 people), no. Behold will be fast, and 250 MB of memory is really unnoticeable on computers today.
But for big files … I’ll get there.
—
Addendum: I reported the excess memory use by ElXTree to LMDInnovative. They’ve added it to their bug tracker and may have some improvements for me in a few months, without me necessarily needing to hack the tree routine for myself.
Joined: Mon, 12 Jan 2009
36 blog comments, 59 forum posts
Posted: Sun, 11 Apr 2010
But does this mean anything that will affect the next beta release timetable?
Joined: Sun, 9 Mar 2003
288 blog comments, 245 forum posts
Posted: Mon, 12 Apr 2010
No, this won’t really affect the beta schedule. The log file is the last thing I’m working on for this beta, and it alerted me to this memory issue. I posted this because it is interesting. I’ll handle it one day, but I don’t plan on going on wild goose chases now. I had hoped to get Version 1.0 out in April. That won’t happen, but I’ll push for May. Really, other than the log file, a few outstanding bugs to fix, and the documentation update, Behold 1.0 is almost ready to go.
Posted: Sun, 18 Apr 2010
Brett: Well, as it turns out, I could not keep myself from it and did spend about 4 days on memory improvements this week, which managed to reduce the 226 MB down to 182 MB. So it did slow me down 4 days. But that is all good, as this work also picked out some bugs that would have needed catching during the beta anyway.