Eggdev: Major flaw with 1.9 hash_table() functions (stdarg read)

Bryan Drewery lordares at shatow.net
Tue Mar 1 18:00:53 CST 2005


Ah, forgot to add this:
in hash_table_find()

--      idx = hash % ht->max_rows;
--      row = ht->rows+idx;

         for (entry = row->head; entry; entry = entry->next) {
                 if (hash == entry->hash && !ht->cmp(key, entry->key)) {
                         *(void **)dataptr = entry->data;
                         return(0);
                 }
         }

Just a linked list search, (always) at rows[0]


------
Bryan Drewery


On Tue, 1 Mar 2005, Bryan Drewery wrote:

>
> I've been playing around/studying these functions for a while today and 
> noticed that there is a major flaw.
> The idea of a hash table is to hash a value and jump directly to it's spot in 
> memory, without the need to loop any lists
> (unless possibly because of collisions).
>
> to create a table, hash_table_create is called with the size of the initial 
> table.
> For example, the user list is defined to be 50:
> #define USER_HASH_SIZE  50
>
> hash_table_t *hash_table_create(hash_table_hash_alg alg, hash_table_cmp_alg 
> cmp, int nrows, int flags)
> {
>        ...
>        ht->max_rows = nrows;
>        ...
> }
>
> The maximum rows are set to the initial size value, so far so good...
>
> Now, skipping down over hash_table_insert()..
>
> ht->cells_in_use is increased for each entry into the table...
>
> ...at the end of hash_table_insert hash_table_check_resize() is called to see 
> if the table needs to be resized...
>
> In this function the following is used to check if the table is too small:
>
>        if (ht->cells_in_use / ht->max_rows > 100) {
>                hash_table_resize(...);
>                ...
>        }
>
> Now, using the defined user table size of 50 and say 4900 users added, the 
> value calculated is:
> 4900 / 50 = ** 98 **
> 98 is still less than the 100 required to resize and add rows to the 
> table...Not until there are 5001 users will the table
> increase from 1 row to 3 rows (there is a multiplier of 3 in the resizing)
>
> Point being, until their are SIZE*100 entries into the first row, the table 
> is just a ** 1 ** row linked-list, which
> defeats the entire purpose of having the hash table functions...
>
> I did some verification checks to this and the resulting hash % cells was 
> always: ** 0 **, meaning the first row.
>
> Reviewing egglib/cvs, I see that the original code simply add a linked list 
> at collisions as is the best way of handling
> them and not dealing with resizing...
> or perhaps lowering the row percentage from 100 to say 2.
>
> Seems that when the hash_table.* files were copied over from lib/egglib to 
> lib/eggdrop, changes such as this occurred and
> went unlogged by stdarg.
>
>
> ------
> Bryan Drewery
>
>



More information about the Eggdev mailing list