Why are there long delays in Mailman's message delivery?

There is a significant delay between Mailman's receiving a post and its delivery to the list members. The post is archived almost immediately, but is not sent to the list members for tens of minutes, hours, or even longer.

The most common reason for this in Mailman is a backlogged out/ queue. This queue is $var_prefix/qfiles/out/ or, in some RedHat/CentOS packages, /var/spool/mailman/out/. Normally, this directory is empty or contains one .bak file and perhaps a few .pck files. The .bak file contains the message currently being processed by OutgoingRunner. The .pck files are additional queued messages waiting to be picked up. OutgoingRunner is single-threaded but can be 'sliced' so that there are multiple OutgoingRunner processes, each processing a slice of the queue. The default is only a single OutgoingRunner process. In any case, there will be at most one .bak file per OutgoingRunner process.
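A quick way to see whether the out/ queue is backlogged is to count its files by extension. The following is a minimal sketch, not part of Mailman; the function name count_queue_files is hypothetical, and the queue path you pass in depends on your installation.

```python
import os
from collections import Counter


def count_queue_files(queue_dir):
    """Return a Counter mapping file extension to file count for queue_dir."""
    counts = Counter()
    for name in os.listdir(queue_dir):
        # splitext("123.pck") -> ("123", ".pck")
        counts[os.path.splitext(name)[1]] += 1
    return counts
```

For example, count_queue_files("/var/spool/mailman/out") returns a Counter in which the ".pck" count is the number of waiting messages and the ".bak" count should not exceed the number of OutgoingRunner processes. A ".pck" count that stays high or keeps growing suggests a backlog.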

Sometimes the queue becomes backlogged. The symptoms of this are the delivery delay mentioned above plus a larger number of .pck files in the queue and 'continuous logging' in Mailman's smtp log.

Each processed message, notice, etc. from Mailman results in an smtp log entry similar to

Jul 25 07:34:09 2016 (19838) <message_id> smtp to listname for nnn recips, completed in tt.ttt seconds

When the queue is backlogged, each successive entry will have a timestamp (07:34:09 in this case) which is tt.ttt seconds later than the prior entry. I.e., there is no OutgoingRunner idle time between messages.
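The timestamp check described above can be automated. This is a rough sketch that assumes the log entries have exactly the format shown, that each entry is written when its delivery completes, and that month names are in English (strptime's %b is locale-dependent). The names ENTRY_RE, parse_entry and idle_gaps are hypothetical helpers, not Mailman functions.

```python
import re
from datetime import datetime

# Matches the smtp log entry format quoted above.
ENTRY_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) \(\d+\) .*? smtp to \S+ "
    r"for \d+ recips, completed in (?P<secs>\d+(?:\.\d+)?) seconds"
)


def parse_entry(line):
    """Return (completion time, duration in seconds), or None if no match."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    finished = datetime.strptime(m.group("ts"), "%b %d %H:%M:%S %Y")
    return finished, float(m.group("secs"))


def idle_gaps(lines):
    """Estimate idle seconds between consecutive deliveries in the log."""
    entries = [e for e in map(parse_entry, lines) if e is not None]
    gaps = []
    for (prev_done, _), (done, secs) in zip(entries, entries[1:]):
        elapsed = (done - prev_done).total_seconds()
        # Time between entries not accounted for by the later delivery.
        gaps.append(elapsed - secs)
    return gaps
```

Because the log timestamps have one-second resolution, gaps within about a second of zero across many consecutive entries indicate that OutgoingRunner is never idle. With several OutgoingRunner slices logging to the same file, the entries interleave, so filter by the process ID in parentheses first.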

This condition can come about because there simply aren't enough machine resources to process the load, but more usually it occurs because of something in the outgoing MTA's configuration that makes the SMTP delivery from Mailman slower than it could be.

Mailman's delivery rate, nnn recips divided by tt.ttt seconds, should be on the order of 20 or more recipients per second, even with full VERP. If it is significantly slower than that, there is probably something in the MTA causing it to be slow.
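The rate can be computed directly from the smtp log. The sketch below assumes the entry format shown earlier; RATE_RE, delivery_rates and overall_rate are hypothetical names used only for illustration.

```python
import re

# Pulls "nnn" and "tt.ttt" out of an smtp log entry.
RATE_RE = re.compile(r"for (\d+) recips, completed in (\d+(?:\.\d+)?) seconds")


def delivery_rates(lines):
    """Return recipients per second for each matching entry."""
    rates = []
    for line in lines:
        m = RATE_RE.search(line)
        if m is None:
            continue
        recips, secs = int(m.group(1)), float(m.group(2))
        if secs > 0:  # skip entries too fast to measure
            rates.append(recips / secs)
    return rates


def overall_rate(lines):
    """Return total recipients divided by total delivery seconds."""
    total_recips, total_secs = 0, 0.0
    for line in lines:
        m = RATE_RE.search(line)
        if m is not None:
            total_recips += int(m.group(1))
            total_secs += float(m.group(2))
    return total_recips / total_secs if total_secs > 0 else None
```

An overall rate well below roughly 20 recipients per second points to the MTA as the bottleneck. Entries with only a few recipients can skew per-entry rates, so the aggregate figure is usually the more useful one.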

The most common cause is the MTA validating the recipient addresses at incoming SMTP time. If this is an issue, you can set up an alternate port on localhost for Mailman to use and disable most checks on mail arriving there. See "Comments from Brad Knowles" at this article for more info.
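On the Mailman side, pointing delivery at an alternate port is a small change in mm_cfg.py. The fragment below is a sketch for Mailman 2.1 using the SMTPHOST and SMTPPORT settings from Defaults.py; the port number 10025 is only an assumed example, and the matching listener with relaxed checks must be configured separately in your MTA.

```python
# Fragment for Mailman 2.1's mm_cfg.py (example values, adjust to your setup).
SMTPHOST = 'localhost'
# Assumed alternate port; it must match the listener you define in the MTA.
SMTPPORT = 10025
```

Restart Mailman after editing mm_cfg.py so the runners pick up the change.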

MailmanWiki: DOC/Why are there long delays in Mailman's message delivery? (last edited 2017-09-22 20:48:37 by msapiro)