The problem that made the mail servers periodically "spaz" out, leaving one or two of them unable to accept mail correctly, might actually be fixed now. A mutex around some socket code in the Spam & AntiVirus filtering modules was being used incorrectly, which let various threads trample each other.
Debugging multithreaded applications is always kind of... random. You might see a problem sporadically, but not consistently enough to really debug it or test a fix. Luckily(?) last night, the moon was in the correct phase to make whatever was cursing the milter code fail consistently. After examining what was going on, I saw that the mutex I was using around the Perl IO::Socket calls wasn't shared between the ClamAV and Spam filter modules, so each module was effectively locking its own private mutex. Since the Perl IO::Socket modules call some very non-thread-safe functions, this was a likely source of the failures.
After some fighting with the Perl threads::shared functionality (it's really picky about how you *reference* thread-shared variables), I got the mutex shared between the two modules, and things started to make a lot more sense.
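Here is a minimal sketch of the idea, not the actual milter code. The package names (`SocketLock`, `ClamFilter`, `SpamFilter`) and the counter are hypothetical, and the counter increment stands in for the real IO::Socket conversation. What it shows is one `:shared` variable that both modules name by its full package path, so that `lock()` in either module contends for the same lock:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical package that owns the one lock both filters must use.
package SocketLock;
use threads::shared;
our $MUTEX :shared;

# Hypothetical stand-in for the ClamAV filter module.
package ClamFilter;
sub scan {
    my ($tally_ref) = @_;
    lock($SocketLock::MUTEX);   # released automatically at end of scope
    $$tally_ref++;              # stand-in for the IO::Socket calls
}

# Hypothetical stand-in for the spam filter module.
package SpamFilter;
sub scan {
    my ($tally_ref) = @_;
    lock($SocketLock::MUTEX);   # the same variable, not a per-module copy
    $$tally_ref++;              # stand-in for the IO::Socket calls
}

package main;

# Shared counter: the increment is only safe because both filters
# serialize on the same lock.
my $tally :shared = 0;

my @workers = map {
    threads->create(sub {
        for (1 .. 500) {
            ClamFilter::scan(\$tally);
            SpamFilter::scan(\$tally);
        }
    });
} 1 .. 4;
$_->join for @workers;

print "socket sections run: $tally\n";
```

The detail that matters is that both filters refer to the very same variable. Copying it into a lexical inside each module (something like `my $mutex = $SocketLock::MUTEX;`) would create a separate variable, which is the kind of referencing mistake that leaves each module with its own lock.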
It seems now that the thread count for the milters is directly tracking the count of incoming SMTP connections, which wasn't quite true previously: threads would seemingly get abandoned. I thought that was just a bug in the Milter libraries themselves, but it may have been this mutex problem all along.
One interesting statistic I saw after adding a little more debug logging to the milter: a message is typically virus- and spam-checked in less than a second. That's pretty cool.
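For anyone wanting the same kind of number, this is roughly how such timing could be logged. It is a sketch that assumes the core Time::HiRes module; `timed_check` and the dummy check are made up for illustration and are not the milter's real logging code:

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a check, log how long it took to STDERR,
# and return the check's result along with the elapsed seconds.
sub timed_check {
    my ($label, $check) = @_;
    my $start   = [gettimeofday];
    my $result  = $check->();
    my $elapsed = tv_interval($start);   # seconds since $start, as a float
    printf STDERR "%s took %.3f s\n", $label, $elapsed;
    return ($result, $elapsed);
}

# Dummy check standing in for the real virus and spam scan.
my ($verdict, $elapsed) = timed_check('virus+spam', sub { return 'clean' });
```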