Bugs: Slow handling of TCL events with USE_TCL_EVENTS

Chris Dearman chris at mipscd.gotadsl.co.uk
Wed Feb 21 12:06:04 CST 2007


Hi,
    I noticed that retrieving Web pages in response to a user request was
incredibly slow.  Here's the bug report and attached is a patch that is
a potential fix:

=====================================================================
1) INFORMATION ABOUT YOUR EGGDROP

1.1) Eggdrop version:
       1.6.18+long-time_t (CVS head)

1.2) Make type:
       (x) debug

1.3) List of any options passed to ./configure:
--prefix=/usr/local/eggdrop-1.6

1.4) List of patches and/or modules you use:
default modules, no patches

=====================================================================
2) INFORMATION ABOUT TCL

2.1) Tcl library version:
       (x) 8.4

2.2) Tcl library patchlevel: 8.4.9

2.3) Tcl scripts used:
       [x] alltools
       [x] others - Please mention all others:
/usr/lib/tcl8.4/http2.5/http.tcl

=====================================================================
3) INFORMATION ABOUT THE OS

3.1) OS type:
       (x) Linux

3.2) OS Version/Release: Debian 3.1

=====================================================================
4) BUG DETAILS

4.1) The logged last context (example: Last context: userent.c/973 []):
N/A

4.2) If the bot wrote to the file DEBUG, copy the text -contents- of
       that file here (NOTE: It should be about 20 lines of info, but it
       could be a few lines more):
N/A

4.3) Your comments and a description of the bug:

The select call in sockread (net.c) uses a timeout of 1 second. If
eggdrop is built with USE_TCL_EVENTS, the main loop runs one TCL event
and then ends up calling sockread which can sleep for 1s if there is no
file I/O to wake it up. Code that relies on TCL events being dispatched
rapidly can be badly affected by this. In particular code using
"fileevent xx readable" on line orientated input generates one TCL event
per line. The http package uses line mode while it reads the response
headers which can causes the initial header parsing to be very slow if
the bot is lightly loaded.

4.4) Can you cause the bug condition to repeat? If so, please outline
       step by step what causes the error:

Eggdrop must be compiled with USE_TCL_EVENTS set. Load the following
code (it's completely contrived, not reentrant etc., but it demonstrates
the pathological behavior of eggdrops event handling)

namespace eval slowcount {
      bind pub - !countlines slowcount::count

      proc count {nick uhost handle channel text} {
          variable state

          if {[file exists $text]} {
              set state(f) [open $text r]

              set state(file) $text
              set state(start) [clock clicks -milliseconds]
              set state(lines) 0
              set state(nick) $nick
              fconfigure $state(f) -blocking 0 -buffering line
              fileevent $state(f) readable {slowcount::nextline}
          }
      }

      proc nextline {} {
          variable state
          if {[gets $state(f) data] >= 0} {
              incr state(lines)
              fileevent $state(f) readable {slowcount::nextline}
          } else {
              set state(stop) [clock clicks -milliseconds]
              set ms [expr $state(stop) - $state(start)]
         putnotc $state(nick) "$state(file): $state(lines) lines, ${ms}ms"
              close $state(f)
          }
      }
}

Make sure that there is not much background activity going on (otherwise
the select does not sleep)
Run the trigger specifying some local file that eggdrop has access to:

chris> !countlines /etc/passwd
bot> /etc/passwd: 30 lines, 17436ms

It took 17s to read a 30 line file...

4.5) Do you have ideas on what is wrong that causes this error?
       Please list them:

The select should be called with a timeout of 0 if there are other TCL
events pending

4.6) Do you have ideas on how to correct it?  Please list them:

See attached patch. It could be simplified a lot but I wanted to leave
the original code intact. With this patch installed I get the following:

chris>  !countlines /etc/passwd
bot> /etc/passwd: 30 lines, 47ms

4.7) Other comments?
No

4.8) If the bot dumped a 'core' file when it crashed, it would be *very*
       useful if you could paste gdb's output during the following steps:
N/A


Chris


-------------- next part --------------
A non-text attachment was scrubbed...
Name: tclevent.diff
Type: text/x-patch
Size: 3134 bytes
Desc: not available
URL: <http://lists.eggheads.org/pipermail/bugs/attachments/20070221/2dccb15f/attachment.bin>


More information about the Bugs mailing list