DEV/Google Summer of Code 2013

GNU Mailman is hoping to participate in Google Summer of Code in 2013. This page captures some ideas for student projects, links to setup guides, etc.

Table of Contents:

Getting Started
Getting in touch with the developers
Project Ideas
Not-fully-defined Project Ideas

Getting Started

Interested in getting started with Mailman development? Here are some links for potential students and mentors:

Mailman is written in Python. For developers, the path of least resistance is probably to use Python 2.7+ (sadly, not all of our requirements support Python 3 yet) and work under Ubuntu Linux, although development on other platforms is certainly possible. Running your dev environment in a VM is also possible and encouraged, although we do not currently have a VM image set up and available to download. (If you'd like to set one up and make it available for future developers, please email mailman-developers@python.org so we can help!)

Development work on Mailman 2.1 has been frozen for some time, so all new project ideas should be related to Mailman 3.

Getting in touch with the developers

If you're interested in participating in GSoC 2013 as a student, mentor, or interested community member, you should join the Mailman Developers mailing list: http://mail.python.org/mailman/listinfo/mailman-developers

Post any questions, comments, etc. to mailman-developers@python.org.

In addition, you may be able to find us on IRC at #mailman on irc.freenode.org. If no one is available to answer your question, please be patient and post it to the mailing list as well. (We *are* the developers for Mailman -- unsurprisingly, most of us prefer to communicate via email.)

Project Ideas

Here are a few ideas for projects we would love to see completed. If you have another idea you'd like to propose (whether as a prospective student, mentor, or interested community member), please send it to mailman-developers@python.org for discussion!

Log monitor

(Suggested by rsk)

Write Python scripts that monitor all of Mailman's logs, integrate what they find, and complain to the appropriate -owner addresses when bad things start happening. Bad things could be any of:

- repeated attempts to subscribe from same IP address/network block

- sudden increase in inbound traffic to otherwise-quiet list

- spike in unsubscriptions (maybe someone is forging them)

- etc.

Bonus points: integrate with HTTP and SMTP logs from whichever web server and whichever MTA are in play. (But this might be better as a follow-on the next year. Given the diversity of the software in play, and thus the diversity of the logs, it's a lot. So maybe the goal should be "don't design or code in a way that rules out doing this in the future".)
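To make the first "bad thing" concrete, here is a minimal sketch of counting subscription attempts per IP address. The log line format, the `SUBSCRIBE_RE` pattern, and the `repeated_subscribers` helper are all assumptions made for illustration; Mailman's real logs are laid out differently, and a real monitor would also need time windows and a way to notify the -owner address.

```python
import re
from collections import Counter

# Hypothetical log line format, e.g.
#   "Mar 01 12:00:00 2013 subscribe attempt from 10.0.0.1"
# Mailman's real log layout may differ; adjust the pattern accordingly.
SUBSCRIBE_RE = re.compile(
    r"subscribe attempt from (?P<ip>\d{1,3}(?:\.\d{1,3}){3})"
)


def repeated_subscribers(lines, threshold=3):
    """Return {ip: count} for IPs with at least `threshold` attempts."""
    counts = Counter()
    for line in lines:
        match = SUBSCRIBE_RE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}
```

The same counting approach would probably extend to the other signals (traffic per list, unsubscriptions per hour) by swapping the pattern and the key being counted.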

Boilerplate stripper

(Suggested by rsk)

An increasing number of devices and services are starting to forcibly append boilerplate to users' email messages. This is annoying.

I'm talking about:

        Sent from my Android
        Sent from my Droid
        Sent from my HTC
        Sent from my Nexus

as well as the stuff appended by some freemail services.

So let's have a piece of code which has a catalog of these and can (if the list-owner so chooses) strip them off if they appear in the last N lines of a message.
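A rough sketch of what such a stripper might look like follows. The `BOILERPLATE_PATTERNS` catalog and the `strip_boilerplate` function are hypothetical names, not existing Mailman code, and the catalog only covers the four examples above. A real version would presumably be a per-list setting.

```python
import re

# Illustrative catalog only; a real one would be larger and configurable.
BOILERPLATE_PATTERNS = [
    re.compile(r"^\s*Sent from my (Android|Droid|HTC|Nexus)\b.*$",
               re.IGNORECASE),
]


def strip_boilerplate(body, last_n=5, patterns=BOILERPLATE_PATTERNS):
    """Remove catalogued boilerplate found in the last `last_n` lines."""
    lines = body.splitlines()
    start = max(0, len(lines) - last_n)
    tail = [line for line in lines[start:]
            if not any(p.match(line) for p in patterns)]
    if len(tail) == len(lines) - start:
        return body  # nothing matched, so leave the message untouched
    kept = lines[:start] + tail
    # Drop blank lines left dangling where the boilerplate used to be.
    while kept and not kept[-1].strip():
        kept.pop()
    return "\n".join(kept) + "\n"
```

Only looking at the last N lines keeps the stripper from touching a quoted "Sent from my ..." line that someone is discussing in the middle of a message.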

Anti-spam/anti-abuse in Mailman

(Suggested by rsk)

One of the general principles of anti-spam work is that abuse control is best done as close to the origination point as possible: thus, stopping spam at your network perimeter is better than stopping it in your firewall is better than stopping it in your MTA is better than stopping it in Mailman.

However...Mailman has localized knowledge that none of the others do, so it's time to craft some serious anti-abuse clue into it -- not just to defend against incoming spam, but against *outgoing* spam, so that list operators don't find their operations blacklisted. What's needed is a flexible piece of code that holds dubious messages for moderation.

An example: freemail account hijackers are sending out messages where the entire body consists solely of a URL. That's an easy test and will stop a lot of spam that will otherwise pass, even on subscriber-only lists (because it's coming from a subscriber's account). Another example: "test" messages. They're either spam or annoying. Another example: messages whose Subject line matches certain key phrases (which I won't include here, in case your incoming mail is content filtered -- mine's not, I don't use it). Another example: any message from often-forged bogus addresses like fbi@gmail.com.

So basically the idea is: have a laundry list of tests, run them sequentially, and if there are any hits, hold the message and tell the right -owner about it. The good news is that this is simple and fast: most tests are a regexp match. Also, the false positive rate on some of these is zero, as in 0.000. The bad news is that once in a while there will be a false positive, but if it's rare enough and if the message need only be owner-approved to go through, then no harm.
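The "laundry list of tests" could be sketched roughly like this, using only the standard library `email` package. The rule names, the `RULES` list, and `first_hit` are hypothetical and are not part of Mailman; the three tests mirror the examples above (URL-only body, "test" messages, often-forged senders). Holding the message and notifying the -owner is left out.

```python
import re
from email import message_from_string
from email.utils import parseaddr

URL_ONLY_RE = re.compile(r"^\s*https?://\S+\s*$", re.IGNORECASE)
TEST_SUBJECT_RE = re.compile(r"^\s*test\s*$", re.IGNORECASE)
FORGED_SENDERS = {"fbi@gmail.com"}  # illustrative, from the example above


def body_is_only_url(msg):
    payload = msg.get_payload()
    # Multipart payloads are lists; this sketch only checks plain bodies.
    return isinstance(payload, str) and bool(URL_ONLY_RE.match(payload))


def subject_is_test(msg):
    return bool(TEST_SUBJECT_RE.match(msg.get("Subject", "")))


def sender_is_often_forged(msg):
    return parseaddr(msg.get("From", ""))[1].lower() in FORGED_SENDERS


# Tests run in order; the first one that fires decides the outcome.
RULES = [
    ("url-only-body", body_is_only_url),
    ("test-message", subject_is_test),
    ("often-forged-sender", sender_is_often_forged),
]


def first_hit(msg, rules=RULES):
    """Return the name of the first rule that fires, or None."""
    for name, test in rules:
        if test(msg):
            return name
    return None


msg = message_from_string(
    "From: Alice <alice@example.com>\nSubject: hi\n\nhttp://example.com/x\n"
)
hold_reason = first_hit(msg)  # a caller would hold the message if not None
```

Returning the rule name rather than a plain boolean means the notice to the -owner can say which test fired, which should make the rare false positive quick to approve.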

Why *not* do this in the MTA?

- The MTA doesn't know about lists and subscribers and patterns.
- Some servers mix mailing lists and end user traffic, and different kinds of anti-spam are appropriate for each.
- Messages going *out* need much, much more scrutiny than messages coming in -- it is much more important not to be a spam source/relay than to protect your own addresses.
- As Mailman gains mindshare, it will come under increasing scrutiny and attack. Now would be a good time to get in front of it. Waiting until the inevitable happens will be too late.
- Mailing list owners often don't run the MTA.

Administrative email message log

(Suggested by rsk)

Add a configurable set of options that will instruct Mailman to keep a log of

(a) all incoming administrative messages and

(b) all outgoing administrative messages.

(a) can be kinda sorta done by configuring the MTA to send second copies elsewhere, but (b) can't...and this really should be done inside Mailman.

The log should consist of full copies of the messages in mbox format. This is useful for debugging, for dealing with attacks, AND for proving, when necessary, that someone really did opt-in or really did unsubscribe. (By administrative messages I mean anything to -request, -owner, etc., as well as outbound subscription confirmation requests.)
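For the mbox part, Python's standard library `mailbox` module already does most of the work. The sketch below assumes a hypothetical `log_admin_message` helper and a caller-supplied path; where Mailman would call it for inbound and outbound administrative mail is exactly the design work this project would involve.

```python
import mailbox
from email import message_from_string


def log_admin_message(mbox_path, msg):
    """Append a full copy of `msg` to the mbox file at `mbox_path`."""
    box = mailbox.mbox(mbox_path)  # created on first use if missing
    box.lock()  # guard against concurrent writers
    try:
        box.add(msg)
        box.flush()
    finally:
        box.unlock()
        box.close()


# Example message; the addresses are placeholders.
confirmation = message_from_string(
    "From: list-request@example.com\n"
    "To: alice@example.com\n"
    "Subject: confirm subscription\n"
    "\n"
    "Reply to this message to confirm.\n"
)
```

Because the result is a plain mbox file, the log can be opened later with any mail client or with `mailbox.mbox` itself when someone needs to check what was actually sent.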

Better content-filtering/handling error messages

(Suggested by rsk)

When messages have content stripped, make sure that the log includes the Message-ID, the MIME type, the file extension, the rule that fired, etc. so that if this was not what was intended, it's much easier to debug.
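As a sketch of what such a log entry might contain, here is a small formatter built on the standard `email` and `logging` modules. The logger name, the `describe_stripped_part` function, and the exact field layout are assumptions for illustration, not Mailman's current logging.

```python
import logging
import os

# Hypothetical logger name; Mailman's own logger names may differ.
log = logging.getLogger("example.contentfilter")


def describe_stripped_part(msg, part, rule):
    """Build a one-line description of a part removed by `rule`."""
    filename = part.get_filename() or ""
    extension = os.path.splitext(filename)[1].lower() or "-"
    return (
        "stripped part: message-id=%s mime-type=%s extension=%s rule=%s"
        % (
            msg.get("Message-ID", "<unknown>"),
            part.get_content_type(),
            extension,
            rule,
        )
    )


def log_stripped_part(msg, part, rule):
    log.info(describe_stripped_part(msg, part, rule))
```

Keeping every field on one line, in a fixed order, makes it easy to grep the log for a given Message-ID or rule when a list owner asks why an attachment went missing.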

Not-fully-defined Project Ideas

These are some ideas that are not as well-defined, kept here for inspiration. If you want to discuss any of these ideas, please send email to mailman-developers@python.org.

Fixing bugs/adding features from the bug queues

All of the components of Mailman have reasonably-maintained bug queues. It would be possible to build a GSoC project around a few wishlist items or desired bug fixes if none of the above projects appeals to you. Do make sure to talk to the mailman-developers list to find out if any given set of bugs would be suitable.
- The bug queue for Mailman core is here: https://bugs.launchpad.net/mailman/
- The bug queue for Postorius (the web UI) is here: https://bugs.launchpad.net/postorius/
- The bug queue for Hyperkitty (the archiver) is here: https://fedorahosted.org/hyperkitty/