Eggdev: Major flaw with 1.9 hash_table() functions (stdarg read)

stdarg stdarg at yahoo.com
Thu Mar 3 12:30:42 CST 2005


Hey,

I'm not sure why I put 100 there, it must be some mistake. That is an awful lot
of collisions before a resize. Feel free to replace it with a #defined constant
and make it more reasonable like something in the range of 1-5 or so heh. Did I
really put 100 in there? That's terrible.

--- Bryan Drewery <lordares at shatow.net> wrote:
> 
> I've been playing around/studying these functions for a while today and
> noticed that there is a major flaw.
> The idea of a hash table is to hash a value and jump directly to it's spot in
> memory, without the need to loop any lists
> (unless possibly because of collisions).
> 
> to create a table, hash_table_create is called with the size of the initial
> table.
> For example, the user list is defined to be 50:
> #define USER_HASH_SIZE  50
> 
> hash_table_t *hash_table_create(hash_table_hash_alg alg, 
> hash_table_cmp_alg cmp, int nrows, int flags)
> {
>          ...
>          ht->max_rows = nrows;
>          ...
> }
> 
> The maximum rows are set to the initial size value, so far so good...
> 
> Now, skipping down over hash_table_insert()..
> 
> ht->cells_in_use is increased for each entry into the table...
> 
> ...at the end of hash_table_insert hash_table_check_resize() is called to 
> see if the table needs to be resized...
> 
> In this function the following is used to check if the table is too small:
> 
>          if (ht->cells_in_use / ht->max_rows > 100) {
>                  hash_table_resize(...);
>                  ...
>          }
> 
> Now, using the defined user table size of 50 and say 4900 users added, the 
> value calculated is:
> 4900 / 50 = ** 98 **
> 98 is still less than the 100 required to resize and add rows to the 
> table...Not until there are 5001 users will the table
> increase from 1 row to 3 rows (there is a multiplier of 3 in the resizing)
> 
> Point being, until their are SIZE*100 entries into the first row, the 
> table is just a ** 1 ** row linked-list, which
> defeats the entire purpose of having the hash table functions...
> 
> I did some verification checks to this and the resulting hash % cells was 
> always: ** 0 **, meaning the first row.
> 
> Reviewing egglib/cvs, I see that the original code simply add a linked 
> list at collisions as is the best way of handling
> them and not dealing with resizing...
> or perhaps lowering the row percentage from 100 to say 2.
> 
> Seems that when the hash_table.* files were copied over from lib/egglib to 
> lib/eggdrop, changes such as this occurred and
> went unlogged by stdarg.
> 
> 
> ------
> Bryan Drewery
> 
> 


	
		
__________________________________ 
Celebrate Yahoo!'s 10th Birthday! 
Yahoo! Netrospective: 100 Moments of the Web 
http://birthday.yahoo.com/netrospective/



More information about the Eggdev mailing list