#pragma page-filename COM/versions/646
= Pickles Begone =

For a while now I've been working on a 
[[http://mailman.svn.sourceforge.net/svnroot/mailman/branches/tmp-sqlalchemy-branc|branch]] of the main Mailman svn trunk. The primary goal of this branch was to see if we could save all list data in a [[http://www.sqlalchemy.org|SQLAlchemy]] backed database instead of all the little individual list-specific pickles that has been a signature of Mailman's "database" almost since its inception. My experiments were rather modest – I wanted to see how little code I had to change in order to get rid of the pickles altogether.

It turns out that really, not much code had to be changed. I think this is a testiment to SQLAlchemy's approach. I had made several attempts at switching to an ORM before, once with [[http://www.sqlobject.org|SQLObject]] and once with SQLAlchemy. In fact, during the last couple of [[http://www.pycon.org|PyCons]] in Washington DC, a number of us sprinting on this very thing. Our goals were much more ambitious though; we tried to layout an elaborate schema for the ultimate Mailman 3 database. In retrospect, this was a good exercise, but because the changes were so radical, the work was never completed.

My previous attempt using SQLAlchemy failed because I was still trying to keep the list-centric approach. IOW, I was trying to keep one database per list, but this didn't work for several reasons. This time around I decided to simplify things by having one database contain all data for all mailing lists. I've had enough success with this that I'm ready to merge my branch back into the trunk.

== Compromises ==

While the branch seems to work pretty well – you can certainly make changes in the web u/i and they persist, and you can send mail through the system – it hasn't been thorougly tested. My opinion is that it won't get more testing in the branch and it's stable enough, so it's time to merge.

I've made some compromises in the schema design in order to get things working. For example, any !MailList attribute that isn't of a simple type (e.g. not a string or number) is defined in the SQLAlchemy schema as a `PickleType` column. Okay, so pickles aren't totally eradicated! Still, this was the quickest path to getting everything working.

For example, the list of email address of the list's owners is stored as a Python list of strings on the attribute. Rather than DTRT now and represent those as its own table, I just make the `MailList.owner` attribute a `PickleType` and SQLAlchemy does the rest. My plan after merging to the trunk is to take these pickle columns one-by-one and split them out into their own tables.

== Still to do ==

I think the very first thing to do after the merge is to build functional import/export scripts. Actually, we already have an export script for the trunk, but this must be ported to the MM2.1 branch. The way existing lists are going to be migrated to the trunk is by exporting an XML file representing one or more mailing lists, then taking those XML files over to a MM2.2 (or 3.0???) site and doing an import. ISTM there's no other way.

Once I have a working import/export, I plan on rewriting the ''pending'' database to be SQLAlchemy'd instead of the pickle it currently is. After that, I will work on individual `PickleType` columns, to de-picklify them too. My top priority is to implement the ''rosters'' idea I've had for a long time. This will be more disruptive to the code base but it will also have the greatest impact on usability I think.
----

<<IncludeComments>>