This entry is old, but is retained for historical purposes. For Mailman versions 2.1.4 and above, you should instead see FAQ 4.66, How do I disable "bulk" pipermail archive files (.mbox and .txt)?.

SPAM PROBLEM:

 HYPOTHESIS
 SOLUTION 2.1.4

 HACK SOLUTION 2.0.11 (AND PROBABLY 2.1.1)
 (1) remove link to *.mbox in
 /var/lib/mailman/Mailman/Archiver/HyperArch.py
 (2) remove compiled versions HyperArch.py[oc]
 (3) test on an archive (e.g. <mylist>)
 (4) make the *.mbox/*.mbox files inaccessible (since Google links remain)
 (5) think about new lists (maybe nothing needs doing)

SPAM PROBLEM:

I think this is a known spam problem - by default, the archive files linked under "download the full raw archive", such as the following, are NOT antispammed (the email addresses in them are not obscured):

http://lists.mydomain.org/mailman/public/mylist.mbox/mylist.mbox

It seems to be present in 2.0.11 and 2.1.1.

HYPOTHESIS

IMHO this is probably not easy to solve because this file is where mail is automatically saved when it's received. If a copy of it is filtered by "@" -> " at " for web access, then overall there are typically 4 copies of the entire mailbox (e.g. the html version, the monthly archives, the true mailbox still containing "@" and hidden from external access, and the " at " version for web access).
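
For illustration, the "@" -> " at " filtering discussed here can be sketched as below. This is only a minimal sketch, not Mailman's own code: the function name obscure_addresses and the simplified address pattern are assumptions made for the example, and real addresses can be more complex than the pattern allows.

```python
import re

# Simplified pattern for an email address (an assumption for this sketch;
# it does not cover every valid address form).
ADDRESS_RE = re.compile(r"([\w.+-]+)@([\w-]+(?:\.[\w-]+)+)")


def obscure_addresses(text):
    """Rewrite user@host as 'user at host' (hypothetical helper, not Mailman's)."""
    return ADDRESS_RE.sub(r"\1 at \2", text)


sample = "From: alice@example.org\nPlease reply to bob@lists.example.org."
print(obscure_addresses(sample))
```

Running a filter like this over the raw mailbox would produce the extra " at " copy mentioned above, which is the storage cost the hypothesis is about.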

I couldn't find whether this has been discussed, but it looks like there's a simple solution in 2.1.4.

SOLUTION 2.1.4

It looks like the solution in mailman-2.1.4 is to offer different templates:

 mailman-2.1.4/templates/en/archtoc.html
 mailman-2.1.4/templates/en/archtocentry.html
 mailman-2.1.4/templates/en/archtocnombox.html   ->  this one has no mbox

e.g. http://mail.python.org/pipermail/mailman-announce/ does not link to *.mbox/*.mbox

However, the raw mbox does exist in http://mail.python.org/pipermail/mailman-announce.mbox/ and it is publicly accessible. (Please don't post the exact URL - no need to google a hidden file.)

Note - in 2.1.4 and above the presence of the link in the archive index (i.e. the choice of which template to use) is controlled by the mm_cfg variable PUBLIC_MBOX which defaults to 'No'. If you change this in mm_cfg.py, you need to restart Mailman with

 bin/mailmanctl restart

and rebuild the archives for existing lists with

 bin/arch --wipe <listname>
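
If there are many existing lists, the rebuild step can be scripted. The following is a rough sketch only, written as a standalone Python 3 script: the helper names arch_commands and rebuild_all are made up for this example, the bin directory /usr/lib/mailman/bin is the Debian location used later in this entry and may differ on your system, and it assumes your version of bin/list_lists supports the -b (bare names) option.

```python
import os
import subprocess


def arch_commands(list_names, bin_dir="/usr/lib/mailman/bin"):
    """Build one 'arch --wipe <listname>' command per list (hypothetical helper)."""
    arch = os.path.join(bin_dir, "arch")
    return [[arch, "--wipe", name] for name in list_names]


def rebuild_all(bin_dir="/usr/lib/mailman/bin", dry_run=True):
    """Rebuild the archives of every list; with dry_run, only print the commands."""
    # Assumption: 'list_lists -b' prints bare list names, one per line.
    result = subprocess.run(
        [os.path.join(bin_dir, "list_lists"), "-b"],
        capture_output=True, text=True, check=True,
    )
    for cmd in arch_commands(result.stdout.split(), bin_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Show the command that would be run for a single list named "mylist".
print(arch_commands(["mylist"]))
```

Calling rebuild_all() with the default dry_run=True only prints the commands, so they can be reviewed before running rebuild_all(dry_run=False) as the Mailman user.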

HACK SOLUTION 2.0.11 (AND PROBABLY 2.1.1)

In 2.0.11, the line pointing to the .mbox is in

/var/lib/mailman/Mailman/Archiver/HyperArch.py (for Debian anyway)

      You can get <a href="%(listinfo)s" >more information about this list</a>
      or you can <a href="%(fullarch)s" >download the full raw archive</a>
      (%(size)s).
      </p>

solution:

(1) remove the link to the full .mbox in /var/lib/mailman/Mailman/Archiver/HyperArch.py

To do this, replace

      You can get <a href="%(listinfo)s" >more information about this list</a>
      or you can <a href="%(fullarch)s" >download the full raw archive</a>
      (%(size)s).
      </p>

by

      You can get <a href="%(listinfo)s" >more information about this list</a>.
      </p>

(2) remove compiled versions (in my case the .pyc gets automatically recompiled)

 rm /var/lib/mailman/Mailman/Archiver/HyperArch.py[co]

(3) test this

 cd /var/lib/mailman/archives/public/
 /usr/lib/mailman/bin/arch mylist

Then check out:

 http://lists.mydomain.org/pipermail/mylist/

Hopefully there will be no link to the .mbox. However, direct access by a spam harvester/spider that knows the URL will still be possible.
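
Instead of checking the page by eye, the regenerated index can be scanned for leftover links. This is a small sketch using only the Python 3 standard library; the names MboxLinkFinder and find_mbox_links are invented for the example, and it simply treats any href containing ".mbox" as a raw-archive link, which is an assumption about how the link looks.

```python
from html.parser import HTMLParser


class MboxLinkFinder(HTMLParser):
    """Collect href values of <a> tags that point at something containing '.mbox'."""

    def __init__(self):
        super().__init__()
        self.mbox_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and ".mbox" in value:
                self.mbox_links.append(value)


def find_mbox_links(html):
    """Return every .mbox link found in the given HTML text (hypothetical helper)."""
    finder = MboxLinkFinder()
    finder.feed(html)
    return finder.mbox_links


before = '<p>or you can <a href="../../mylist.mbox/mylist.mbox">download the full raw archive</a></p>'
after = '<p>You can get <a href="../listinfo/mylist">more information about this list</a>.</p>'
print(find_mbox_links(before))
print(find_mbox_links(after))
```

To check a live archive, the HTML of the index page (for example the one at the URL above) would be fetched or read from disk and passed to find_mbox_links; an empty result means the link is gone from that page.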

(4) So, make the .mbox files inaccessible, since Google links to them will still hang around for some time

 chmod go-rw /var/lib/mailman/archives/private/*.mbox/*.mbox
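
To confirm that the chmod took effect, the permission bits can be inspected. The sketch below is only illustrative: the helper name still_accessible is made up, and the glob pattern in the usage note is the Debian path from the command above.

```python
import glob
import os
import stat

# Permission bits that 'chmod go-rw' is meant to clear.
GROUP_OTHER_RW = stat.S_IRGRP | stat.S_IWGRP | stat.S_IROTH | stat.S_IWOTH


def still_accessible(pattern):
    """Return the paths matching pattern that group or others can still read or write."""
    return [
        path for path in sorted(glob.glob(pattern))
        if os.stat(path).st_mode & GROUP_OTHER_RW
    ]
```

For example, still_accessible("/var/lib/mailman/archives/private/*.mbox/*.mbox") should return an empty list once step (4) has been applied to every list.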

(5) probably nothing needed when running newlist for new lists.

Making new lists will, by default, write *.mbox/*.mbox files that are web accessible, but nobody reasonable and careful is going to link to them.

Known threads on this issue:

 http://mail.python.org/pipermail/mailman-developers/2004-February/016569.html
 http://zope.org/Members/bwarsaw/MailmanDesignNotes/MailmanProblems (wiki)

Converted from the Mailman FAQ Wizard

MailmanWiki: DOC/4.34 I don't want the "download full raw archive" link which provides non-antispammed email addresses (last edited 2015-01-31 02:36:58 by msapiro)