Managing Translations

See Bug 1414154 in Launchpad for newer discussion of this. We must have this figured out for the Mailman suite 3.0 release.

Until now, translations have been managed in a bit of a haphazard way, with me (Barry) often as the bottleneck. Sometimes people email diffs or tarballs, sometimes they post patches on the trackers, and some translators have write access to the source repository. There are other problems with the current structure of i18n management, and I would like to improve the situation. This page attempts first to collect requirements, options, and discussions about how to better manage translations, both for the core software developers and for the translators.

What I do not want to do is dictate a particular solution. As the lead developer, I need certain things, but I am not a translator (I'm a typical monolingual American, despite years of high school Spanish). I want to choose a solution that makes translators' lives easier, while still meeting a certain minimum of requirements on the software development side.

Clytie: as a translator, I really appreciate your willingness to meet us half-way. We suffer a good deal from obstacles of different kinds to translation in different projects, so facilitating the process will make for more participation and better translations.

Software development requirements

Here, in no particular order, are my requirements:

  • More separation between the software development schedule and the translation schedule. By this I mean, I would like a system whereby I can upload new mailman.pot files whenever we have a bunch of new strings in the code. Then I would like to be able to push a button just before a new release to get all the latest translated mailman.po files.
    Clytie: no reason why you couldn't build this in via gettext, which is designed for that kind of function. And I agree that it's important to get the strings out there for translation as quickly as possible. Also, the .pot file (and updated templates, if that structure continues) needs to be easily accessible.

  • One clearly documented mechanism for providing new languages and updates. Let's figure out what we're going to do and then get everyone on the same page.
    Clytie: yes indeed! Simple, straightforward and effective. Mailing-list managers want something consistent, that they can factor into their workflow. They evaluate software based not only on how good it is, but how well it works with the rest of their system.

  • Translations independent of Mailman version. I want to be able to have a single mailman.po file that contains the union of all strings from say, Mailman 2.1 and 2.2, so that translators can just work on texts without regard to which version it will go in. This will make it easier for me to integrate updates into whatever version is getting released next.
    Clytie: Hmm... since some features change from one version to another, this might not be such a good idea. We are supposed to translate the strings as part of a coherent process. It would be more confusing (since mailman.po is currently completely bare of context), to have strings from different versions dealing with the same situations differently. I think an integrated way of handling updates would work better. We are accustomed to working with versions: updating is part of our workflow. You can certainly use gettext to create an overall PO file (compendium) of all translations for a certain language, regardless of version, but it's not a good idea to ask us to translate such a possibly incoherent file. The quality of translation would be significantly lower.

  • Management of legal issues. Mailman is GPL'd software with copyrights owned by the FSF. As such, we need certain legal issues to be addressed by anyone who wants to donate translations. Maybe you will need to sign a copyright assignment or disclaimer. I would like some external process to manage this so I don't have to. E.g. Rosetta has a GNU Translators Team which requires the proper paperwork before donations can be accepted. This ensures that projects availing themselves of such translations don't have to worry about getting the paperwork signed before they can accept the translation.
    Clytie: you can build this into the translation system you choose. If you choose the TP, you simply stipulate that you want a FSF disclaimer signed: most files there do. You can't really translate for the TP unless you have signed this disclaimer. Your files will not be accepted. There are a few files which don't require the disclaimer, but they are the only ones that will be accepted without it. It is the standard. On Rosetta, you can achieve this by limiting participation to the GNU Translators' Group. On Pootle, you can set any constraints that you like: the disclaimer would be built into the process, simply by stipulating it as a pre-requisite for participation in translation. It's important to remember that many translators from other projects may already have logged this disclaimer: make it a pre-requisite, not a step they have to go through again.

  • One contact person per language. While I sincerely hope to encourage lots of participation (and plan on acknowledging all those who contribute, if possible), I would like one leader, or champion, to whom I can go for questions, issues, or problems. I often get patches or bug reports sent to me directly from users, and often there's little I can do about it, because I do not speak the language. I would like to have one person per language who will lead the effort.
    Clytie: I agree. It's the usual structure in translation projects, with a backup person logged with the overall co-ordinator in case of absence etc. A hierarchy of information distribution and application needs to be established, or you will end up with parallel and diverging events.

  • More community involvement. Rather than have just people who are interested in Mailman, I would love to tap into the large resource of people who are translating open source in general. I think we can get more languages supported if we broaden our horizons.
    Clytie: most definitely. Link Mailman to the overall community, via the TP, Pootle or Rosetta, and you start using the established network. You can also ask language-team leaders to encourage other translators at their other projects, to participate. Translators routinely cross project boundaries, so we are effective information disseminators and recruiters.

  • I'm not the sysadmin! Sadly, much as I love being a sysadmin <wink>, I don't have time to administer an i18n system. Which means whatever we choose must be hosted elsewhere, or we need a machine and volunteers to keep things running.
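To make the first and third requirements above concrete, here is a minimal sketch in Python of what "upload a new template, then collect the translations later" and "one file holding the union of strings from several versions" amount to. It is only an illustration: catalogs are modelled as plain dicts and lists rather than real PO files, and the function names and sample strings are hypothetical. In practice the GNU gettext tools msgmerge and msgcat do this kind of work on real .pot and .po files.

```python
def union_templates(*templates):
    """Return the ordered union of source strings from several templates."""
    seen = set()
    union = []
    for template in templates:
        for msgid in template:
            if msgid not in seen:
                seen.add(msgid)
                union.append(msgid)
    return union


def merge_catalog(template_ids, old_translations):
    """Carry existing translations over to a new template.

    Returns (merged, obsolete): merged maps every string in the template
    to its old translation, or "" if it still needs translating; obsolete
    holds old translations whose source string is no longer in the template.
    """
    merged = {msgid: old_translations.get(msgid, "") for msgid in template_ids}
    obsolete = {
        msgid: text
        for msgid, text in old_translations.items()
        if msgid not in merged
    }
    return merged, obsolete


# Hypothetical strings, not taken from Mailman's real catalogs.
pot_21 = ["Subscribe", "Unsubscribe"]
pot_22 = ["Subscribe", "Edit options"]
old_fr = {"Subscribe": "S'abonner", "Goodbye": "Au revoir"}

merged, obsolete = merge_catalog(union_templates(pot_21, pot_22), old_fr)
```

After this runs, merged keeps the existing translation of "Subscribe" and lists "Unsubscribe" and "Edit options" as untranslated, while "Goodbye" ends up in obsolete. Clytie's objection still applies: a union like this says nothing about which version a string belongs to, so translators would see it without that context.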

Note that one other thing I would really like is an overall Translation Coordinator. This would be one person who would help coordinate all the individual languages, be Mailman's official interface to whatever translation organization we choose, help recruit new translators and languages, and help to socialize the policies and mechanisms we will use to translate Mailman. Are you that person? Email me <barry NO AT HARVEST python DOT org>.
Clytie: if nobody else is really keen to do it, I'm willing to give it my best. But please, if you want to do it, don't let me discourage you. Your effort is very welcome, and I have plenty of other things to do! I just want to see Mailman i18n doing as well as its code does.

Translators' requirements

Here's where you fill in your thoughts. I'm not a translator so I don't know much about what you need to be more effective.

Clytie: above all, we need a straightforward, effective system that is as simple as possible. As soon as we start adding expectations, we start losing translators. There are a great many linguistically-talented people who don't have added computing skills. This means that something like Pootle, which takes care of all the underlying process and simply provides strings to translate, is such a powerful choice. With your own configuration, it encourages translators to use even small amounts of time to do a few strings at a time. This approach has worked extremely well for the Distributed Proofreaders project, the text-production arm of Project Gutenberg: its slogan is "Every string helps!" (or similar), and the results have been staggering. Rosetta has seen some of that compound participation, but it doesn't yet have IMHO enough configuration power to set up individual projects well, so quality and management have been a problem.

Of the non-common-interface translation projects, the TP (Translation Project) is the simplest to use. Translators, once they are registered and have logged their disclaimer, simply download files, translate them, then submit them via email. A robot program runs checks over the submitted file, and sends it back with explanations if it is not entirely correct. Developers also submit their files to this robot program. The TP is the most efficient i18n project, IMHO, and if you simply want your files translated, you couldn't do better.
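As an illustration of the kind of automated check such a robot could run over a submitted file, here is a minimal Python sketch. It is not the TP robot's actual logic, which this page does not describe. It assumes translations use Python-style named placeholders such as %(user)s, and the function name is hypothetical. It reports any placeholder that a translation dropped or invented.

```python
import re

# Matches Python-style named placeholders such as %(user)s or %(count)d.
PLACEHOLDER = re.compile(r"%\(\w+\)[sd]")


def placeholder_problems(msgid, msgstr):
    """Return a list of human-readable problems for one translated string."""
    if not msgstr:
        # Untranslated entries are not errors; they are simply not done yet.
        return []
    expected = set(PLACEHOLDER.findall(msgid))
    found = set(PLACEHOLDER.findall(msgstr))
    problems = []
    for name in sorted(expected - found):
        problems.append(f"missing placeholder {name}")
    for name in sorted(found - expected):
        problems.append(f"unknown placeholder {name}")
    return problems


# Hypothetical example strings.
print(placeholder_problems("Welcome, %(user)s", "Bienvenue"))
```

A report like this, sent back to the translator with the file, is the sort of explanation that makes a rejected submission easy to fix.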

However, from the simplicity point of view, the TP may be the easiest to use, but it requires that you understand PO files, and can use gettext. These things may seem very simple to people who have been using them for some time, but we want to attract more translators: I only started translating Mailman quite recently, and in that time we have had at least three new translators arrive who have no previous experience with PO format. So, if we are going to continue PO-based, we need to build in some support for newish translators. Our howto helps, but there definitely needs to be a mentoring process. The dropout rate for new translators is high in OSS: if we want to keep our new translators, we need to look after them. I'd be interested to hear from some newer translators on this: I do remember that the projects to which I have contributed most, are those which welcomed me when I started, and offered me good support.

All other i18n projects add layers of complexity on the basic PO file translation process. Some use non-standard syntax, which is a nightmare; others require specific procedures which eat up a lot of time and effort, especially when one starts a new project. Again, those projects lose a lot of translators. Only the *translation team* has really kept translators in OSS, by building a language/culture-specific support structure, usually based around one or two team-leaders and a lot of personal effort. If we provide for the language-team structure or, even better, communicate with existing language teams and ask them to add Mailman to their list, we plug in to an established and successful support system. However, the real work in i18n, as in the rest of OSS, is done by a surprisingly small number of people. Every new translator we can attract and keep, is a great gain. The work we put into support now, will bring lasting and possibly compound results.

Projects like Gnome, KDE, Debian and Xfce, even other individual-program projects like Psi and Gaim, all build their own procedure on top of standard PO files (the suite projects Mozilla and OpenOffice.org make life difficult by using their own format), access via SVN (Gnome is moving from CVS to SVN as soon as they can get python sorted out on their server) and a system of status pages which are IMHO their greatest achievement in i18n. Please see the KDE status pages for interface files (they also have these for documentation). If we are restructuring the Mailman i18n process, I would strongly recommend having a status page that shows progress for each language (better still, for each file for each language). These pages are useful for everyone concerned, and they're extremely motivating. You can see exactly where you are, and get almost immediate feedback on your progress, when you submit files. These pages are updated at least once daily, usually three times.
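The numbers behind a status page are simple to compute. The following Python sketch is not the software that KDE or any other project uses. It assumes each language's catalog has already been loaded into a dict mapping source strings to translations, with an empty string meaning "not yet translated". The function names and sample data are hypothetical.

```python
def translation_status(catalogs):
    """Return (language, translated, total, percent) rows, sorted by language."""
    rows = []
    for lang, catalog in sorted(catalogs.items()):
        total = len(catalog)
        done = sum(1 for text in catalog.values() if text)
        percent = 100 * done // total if total else 0
        rows.append((lang, done, total, percent))
    return rows


def format_status(rows):
    """Render the rows as plain-text lines for a status page or email."""
    return [
        f"{lang}: {done}/{total} strings translated ({percent}%)"
        for lang, done, total, percent in rows
    ]


# Hypothetical catalogs, not real Mailman data.
catalogs = {
    "vi": {"Subscribe": "x", "Unsubscribe": ""},
    "fr": {"Subscribe": "S'abonner", "Unsubscribe": "Se désabonner"},
}
for line in format_status(translation_status(catalogs)):
    print(line)
```

Running the same calculation per file as well as per language would give the finer-grained view recommended above.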

(I've sent Barry some info on the current status of the status-pages software.)

We can also learn from the TP, who ask translators to register if they want to receive auto-update emails. These emails are sent out as soon as a file is updated. This may motivate translators who don't follow mailman-i18n as closely as they should. In any event, they work to keep in touch with each translator, and avoid people "falling off" the end of the process. If you don't have time to catch up with your mailing lists (I haven't got to mine today yet!), you do check your main Inbox (where I found the mail about this page), so you'll see the update mails. I do my email updates before I do any of my other tasks, and I think other translators would also give them priority in their workflow. It's efficient.

Debian is now following this process for debconf files, and has also started sending out a status email for each level of the debian-installer. It's in diff form, currently, which I think needs to be more people-readable. I've found this very handy already.

I like the translate-a-thon idea, also building more community, more communication (forum? Jabber, IRC), having targets and goals we establish for ourselves. We can make this a lot more fun than it currently is (again, Distributed Proofreaders are a great example).

We definitely need an integrated system for getting and submitting translation files. We need to look at each stage of our i18n process and review what works now, and what needs improving, and how we can do it. There is a lot we can do to improve it, and first of all we need to hear what translators think of the current system, and what they would suggest we do. You've certainly heard from me, here, off the top of my head.

What do you think?

Translation projects

Here is the list of open source translation projects that we could use to manage Mailman's i18n. If you know about others, please add them here. If you have experience with them and can provide pros and cons, please do so, as comments on this page.

  • One option, of course, is to keep the current status quo. It's not a good option, but it's an option.

  • Translatewiki.net: Use of key-based file formats is preferred, like Java properties and Ruby-style YAML. Other supported file formats are PHP arrays, PHP variables and Gettext. Let us know if you have a different format in mind; in our experience adding file format support is relatively easy for someone proficient in PHP.

  • The Translation Project

  • Launchpad's Rosetta

  • Pootle and translate.sourceforge.net

It depends on whether you want to plug in your files, or run your own project. I don't know of any project that does what the TP or Pootle do, as well as they do. There are pale imitations, but that's all. I'm uncomfortable with the present lack of configuration and quality assurance on Rosetta. Pootle is state-of-the-art in that technology, and it already provides a great deal more configurable control over the project and individual translations.

Additional past conversations

MailmanWiki: DEV/Managing Translations (last edited 2015-03-04 05:45:31 by msapiro)