[Israel.pm] a hash question
mjd-list-israelpm at plover.com
Wed Feb 18 04:25:30 PST 2004
> I remember MJD recommending one of the DBM files - do people remember
> which ? and why ?
DB_File is the only one that doesn't seem to have big problems.
ODBM, NDBM, and SDBM all have limits on the sizes of the data that you
can store. For example, with SDBM, each key and the data together may
not be more than 1024 bytes.
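To see the size limit in practice, here is a minimal sketch using SDBM_File, which ships with Perl. The key names, the temporary directory, and the 2000-byte value are made up for illustration; 2000 was picked only because it is well above the limit. The SDBM_File documentation gives the limit as 1008 bytes for key plus value, so the exact figure may depend on the build.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Scratch directory, removed when the program exits.
my $dir = tempdir(CLEANUP => 1);

tie my %h, 'SDBM_File', "$dir/demo", O_RDWR|O_CREAT, 0666
    or die "Cannot tie SDBM file: $!";

# A store that SDBM rejects raises an exception, so wrap each one in eval.
my $small_ok = eval { $h{small} = 'x' x 100;  1 } ? 1 : 0;
my $big_ok   = eval { $h{big}   = 'x' x 2000; 1 } ? 1 : 0;

print "small pair stored:     $small_ok\n";
print "oversized pair stored: $big_ok\n";

untie %h;
```

The same stores against a hash tied to DB_File should both succeed, assuming Berkeley DB is installed.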
Also, these three scale very badly when you try to put a lot of data
into them:
  # Keys    File extent    Space used
                (ls -l)       (ls -s)
       1           1024             8
       2           2048             8
       4           4096             8
       8           8192            48
      16         120832           120
      32         245760           208
      64         441344           296
     128        4251648           456
     256       12701696          1456
     512       21091328          2320
    1024       33284096          4128
    2048      536668160         11592
    4096     1065409536         22272
Most unix systems have a 2GB limit on the extent of a file, so even
though we're not storing very much data, the file is soon too big for
the OS to handle.
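A rough way to run this kind of measurement yourself is sketched below. It is not the test that produced the table above: the choice of SDBM_File, the key names, and the 200-byte payload are all assumptions, so the numbers will differ. It only shows how to read the two columns, the file extent (what ls -l reports) and the allocated blocks (what ls -s reports), from stat.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $base = "$dir/scale";         # SDBM creates $base.pag and $base.dir

tie my %h, 'SDBM_File', $base, O_RDWR|O_CREAT, 0666
    or die "Cannot tie SDBM file: $!";

my @report;                      # rows of [keys, extent in bytes, blocks]
my $stored = 0;
for my $target (1, 2, 4, 8, 16, 32, 64) {
    # Top the database up to $target keys.
    while ($stored < $target) {
        $h{"key$stored"} = 'x' x 200;    # arbitrary payload (assumption)
        $stored++;
    }
    # Element 7 of stat is the size in bytes, element 12 the number of
    # blocks actually allocated (empty on systems that do not report it).
    my @st = stat("$base.pag") or die "stat failed: $!";
    push @report, [ $target, $st[7], $st[12] ];
    printf "%6d %12d %8s\n", $target, $st[7], $st[12];
}

untie %h;
```

When the extent is much larger than the allocated blocks would account for, the file is sparse: the OS still has to address the whole extent even though little of it holds data.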
GDBM doesn't have these limitations. But I don't use it any more
because in 1998 I was using it for a web user database for a major
client; we had about 320,000 registered users, and one day, the
'firstkey' and 'nextkey' routines stopped producing all the keys.
They would generate about 1,700 of the usernames and then stop, so I
couldn't get the list of our users.
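One cheap guard against this kind of failure is to walk the database regularly and compare the number of keys you get back with a count you keep separately. The sketch below assumes you have such a count; count_keys is a made-up helper name, and SDBM_File is used only because it ships with Perl. Walking a tied hash with each() is what calls the DBM module's firstkey and nextkey routines.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Count keys by iterating with each(), which drives FIRSTKEY/NEXTKEY
# in the tied class. (Hypothetical helper, not part of any module.)
sub count_keys {
    my ($hash) = @_;
    my $n = 0;
    while (defined(my $key = each %$hash)) {
        $n++;
    }
    return $n;
}

my $dir = tempdir(CLEANUP => 1);
tie my %users, 'SDBM_File', "$dir/users", O_RDWR|O_CREAT, 0666
    or die "Cannot tie SDBM file: $!";

# Populate with made-up usernames and keep an independent count.
my $expected = 1000;
$users{"user$_"} = 1 for 1 .. $expected;

my $seen = count_keys(\%users);
if ($seen == $expected) {
    print "iteration returned all $seen keys\n";
} else {
    warn "iteration returned $seen keys, expected $expected\n";
}

untie %users;
```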
I sent a detailed bug report to the GNU folks, offering to do whatever
I could to help, and the reply said:
    I have heard of this happening before. I was not able to find
    out why. Do you have a backup of earlier versions so you can
    get most of your keys out? If so, you might try to recover by
    moving to DB-2.? routines. They are still being updated and
    developed. gdbm has not had any active development in years.
So I restored what I could from the backup tapes, I switched to
Berkeley DB, and I have not used GDBM since then.
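Moving records from one DBM format to another can be as simple as tying two hashes and copying across. This is only a sketch of the idea, not the procedure actually used back then: copy_records is a made-up helper, the file names in the comments are placeholders, and plain hashes stand in for the tied ones so that the example runs without GDBM or Berkeley DB installed. Note that it can only copy the keys that iteration still returns, which is why the backups mattered.

```perl
use strict;
use warnings;

# Copy every key/value pair that each() yields from one hash to another.
# Works the same whether the hashes are plain or tied to DBM files.
# (Hypothetical helper, not part of any module.)
sub copy_records {
    my ($src, $dst) = @_;
    my $copied = 0;
    while (my ($key, $value) = each %$src) {
        $dst->{$key} = $value;
        $copied++;
    }
    return $copied;
}

# In real use both hashes would be tied, for example:
#   use GDBM_File; use DB_File; use Fcntl;
#   tie my %old, 'GDBM_File', 'users.gdbm', &GDBM_READER, 0640
#       or die "Cannot open old database: $!";
#   tie my %new, 'DB_File', 'users.db', O_RDWR|O_CREAT, 0640, $DB_HASH
#       or die "Cannot open new database: $!";
my %old = (alice => 'alice@example.com', bob => 'bob@example.com');
my %new;

my $copied = copy_records(\%old, \%new);
print "copied $copied records\n";
```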