Last week, I blogged about my efforts to fight mailman subscription spam. Enabling
SUBSCRIBE_FORM_SECRET as described there indeed helped to drastically reduce
the amount of subscription spam from more than 1000 to less than 10 mails sent
per day, but some attackers still got through. My guess is that those machines
were just so slow that they managed to wait the required five seconds before
submitting the form.
So, clearly I had to up my game. I decided to follow through on my plan to write a simple CAPTCHA for mailman (one that doesn’t expose your users to Google). This post describes how to configure and install that CAPTCHA.
This simple CAPTCHA is based on a list of questions and matching answers that you, the site admin, define. The idea is to use questions that anyone who is interested in your site can easily answer. Since most sites are small enough that they are unlikely to be targeted specifically, the bots (hopefully) will not be able to answer these questions. At least for my sites, that has worked so far (I have been running with this patch for a week now).
The CAPTCHA requires
SUBSCRIBE_FORM_SECRET to be enabled. Configuration can
look something like this:
    SUBSCRIBE_FORM_SECRET = "<some random string, generated e.g. by [openssl rand -base64 18]>"
    CAPTCHAS = [
        # This is a list of questions, each paired with a list of answers.
        ('What is two times six?', ['12', 'twelve']),
        # Strings containing an apostrophe need double quotes.
        ("What is the name of this site's blog", ["Ralf's Ramblings"]),
    ]
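Since a malformed entry in that list would only show up once the subscription form is rendered or submitted, it can be worth checking the list first. The following is a minimal sketch, not part of the patch: `check_captchas` is a hypothetical helper, and it assumes each entry is a `(question, answers)` tuple as in the configuration above.

```python
def check_captchas(captchas):
    """Return a list of problems; an empty list means the entries look fine.

    Hypothetical helper, not part of Mailman or of the patch.
    """
    problems = []
    for i, entry in enumerate(captchas):
        # Each entry must be a (question, answers) pair.
        if not (isinstance(entry, tuple) and len(entry) == 2):
            problems.append("entry %d is not a (question, answers) pair" % i)
            continue
        question, answers = entry
        if not (isinstance(question, str) and question.strip()):
            problems.append("entry %d has an empty or non-string question" % i)
        if not (isinstance(answers, list) and answers):
            problems.append("entry %d has no list of answers" % i)
        elif not all(isinstance(a, str) and a.strip() for a in answers):
            problems.append("entry %d contains an empty or non-string answer" % i)
    return problems


CAPTCHAS = [
    ('What is two times six?', ['12', 'twelve']),
    ("What is the name of this site's blog", ["Ralf's Ramblings"]),
]
print(check_captchas(CAPTCHAS))  # prints []
```

An empty result only says the structure is right; whether the questions are easy for humans and hard for bots is still up to you.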
Right now, the
CAPTCHAS part of the configuration will not yet do anything
because you still have to install my patch. The patch is loosely based on
this blog post and was
written against Mailman 2.1.23 on Debian 9 “Stretch”. If you are using a
different version you may have to adapt it accordingly.
First of all, create a new file
/usr/lib/mailman/Mailman/Captcha.py with the following content:

    import random
    from Mailman import Utils

    def display(mlist, captchas):
        """Returns a CAPTCHA question, the HTML for the answer box, and
        the data to be put into the CSRF token"""
        # Pick a random (question, answers) pair and remember its index.
        idx = random.randrange(len(captchas))
        question = captchas[idx][0]
        box_html = mlist.FormatBox('captcha_answer', size=30)
        return (Utils.websafe(question), box_html, str(idx))

    def verify(idx, given_answer, captchas):
        # The index comes back from the client, so validate it first.
        try:
            idx = int(idx)
        except ValueError:
            return False
        if idx not in range(len(captchas)):
            return False
        # Check the given answer, ignoring case and surrounding whitespace
        correct_answers = captchas[idx][1]
        given_answer = given_answer.strip().lower()
        return given_answer in [x.strip().lower() for x in correct_answers]
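To get a feeling for how lenient the answer check is, here is a standalone sketch of just the comparison step, without any Mailman imports. The function name `matches` is made up for this illustration; it assumes the answers are given as a list of strings, as in the configuration.

```python
def matches(given_answer, correct_answers):
    """Compare an answer against the accepted ones, ignoring case
    and leading/trailing whitespace (illustrative helper only)."""
    normalized = given_answer.strip().lower()
    return normalized in [a.strip().lower() for a in correct_answers]


answers = ["Ralf's Ramblings"]
print(matches("  ralf's ramblings ", answers))  # True
print(matches("Ralfs Ramblings", answers))      # False
```

Only case and surrounding whitespace are forgiven, so if you expect spelling variants (such as a missing apostrophe), list them explicitly as additional answers.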
This contains the actual CAPTCHA logic. Now it needs to be wired up with the listinfo page (where the subscription form is shown to the user) and the subscription page (where the subscription form is submitted to).
Here is the patch for listinfo.py:

    --- listinfo.py.orig  2018-06-03 19:18:30.089902948 +0200
    +++ listinfo.py  2018-06-10 19:12:59.381910750 +0200
    @@ -26,6 +26,7 @@
     
     from Mailman import mm_cfg
     from Mailman import Utils
    +from Mailman import Captcha
     from Mailman import MailList
     from Mailman import Errors
     from Mailman import i18n
    @@ -216,10 +220,16 @@
                 # drop one : resulting in an invalid format, but it's only
                 # for our hash so it doesn't matter.
                 remote = remote.rsplit(':', 1)[0]
    +        # get CAPTCHA data
    +        (captcha_question, captcha_box, captcha_idx) = Captcha.display(mlist, mm_cfg.CAPTCHAS)
    +        replacements['<mm-captcha-question>'] = captcha_question
    +        replacements['<mm-captcha-box>'] = captcha_box
    +        # fill form
             replacements['<mm-subscribe-form-start>'] += (
    -                '<input type="hidden" name="sub_form_token" value="%s:%s">\n'
    -                % (now, Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET +
    +                '<input type="hidden" name="sub_form_token" value="%s:%s:%s">\n'
    +                % (now, captcha_idx, Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET +
                               now +
    +                          captcha_idx +
                               mlist.internal_name() +
                               remote
                               ).hexdigest()
And here is the patch for subscribe.py:

    --- subscribe.py.orig  2018-06-03 19:18:35.761813517 +0200
    +++ subscribe.py  2018-06-03 20:35:00.056454989 +0200
    @@ -25,6 +25,7 @@
     
     from Mailman import mm_cfg
     from Mailman import Utils
    +from Mailman import Captcha
     from Mailman import MailList
     from Mailman import Errors
     from Mailman import i18n
    @@ -144,13 +147,14 @@
                 # for our hash so it doesn't matter.
                 remote1 = remote.rsplit(':', 1)[0]
             try:
    -            ftime, fhash = cgidata.getvalue('sub_form_token', '').split(':')
    +            ftime, fcaptcha_idx, fhash = cgidata.getvalue('sub_form_token', '').split(':')
                 then = int(ftime)
             except ValueError:
    -            ftime = fhash = ''
    +            ftime = fcaptcha_idx = fhash = ''
                 then = 0
             token = Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET +
                                   ftime +
    +                              fcaptcha_idx +
                                   mlist.internal_name() +
                                   remote1).hexdigest()
             if ftime and now - then > mm_cfg.FORM_LIFETIME:
    @@ -165,6 +169,10 @@
                     results.append(
         _('There was no hidden token in your submission or it was corrupted.'))
                 results.append(_('You must GET the form before submitting it.'))
    +        # Check captcha
    +        captcha_answer = cgidata.getvalue('captcha_answer', '')
    +        if not Captcha.verify(fcaptcha_idx, captcha_answer, mm_cfg.CAPTCHAS):
    +            results.append(_('This was not the right answer to the CAPTCHA question.'))
         # Was an attempt made to subscribe the list to itself?
         if email == mlist.GetListEmail():
             syslog('mischief', 'Attempt to self subscribe %s: %s', email, remote)
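Both patches put the question index into the hashed form token. The point is that a bot cannot simply swap in the index of a question it already knows the answer to, because the hash would no longer match. Here is a rough standalone sketch of that idea. It is not the Mailman code: `make_token` and `token_is_consistent` are made-up names, and I am assuming `Utils.sha_new` behaves like `hashlib.sha1`.

```python
import hashlib


def make_token(secret, now, captcha_idx, list_name, remote):
    """Build a 'time:index:hash' token similar to the one in the form.

    Illustrative only; all arguments are strings.
    """
    data = secret + now + captcha_idx + list_name + remote
    digest = hashlib.sha1(data.encode("utf-8")).hexdigest()
    return "%s:%s:%s" % (now, captcha_idx, digest)


def token_is_consistent(token, secret, list_name, remote):
    """Recompute the hash from the submitted fields and compare."""
    try:
        ftime, fidx, fhash = token.split(":")
    except ValueError:
        # Wrong number of fields: treat as corrupted.
        return False
    data = secret + ftime + fidx + list_name + remote
    return hashlib.sha1(data.encode("utf-8")).hexdigest() == fhash


token = make_token("s3cret", "1528000000", "1", "mylist", "192.0.2")
print(token_is_consistent(token, "s3cret", "mylist", "192.0.2"))  # True

# A bot that changes the question index invalidates the hash.
parts = token.split(":")
parts[1] = "0"
print(token_is_consistent(":".join(parts), "s3cret", "mylist", "192.0.2"))  # False
```

This also shows why the CAPTCHA depends on SUBSCRIBE_FORM_SECRET: without the secret in the hash, anyone could compute a valid token for any index.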
Finally, the HTML template for the listinfo page needs to be updated to show the CAPTCHA question and answer box.
On Debian, the templates for enabled languages are located in
The patch for the English template looks as follows:
    --- /usr/share/mailman/en/listinfo.html  2018-02-08 07:54:28.000000000 +0100
    +++ listinfo.html  2018-06-03 20:35:10.680275026 +0200
    @@ -116,6 +116,12 @@
           </tr>
           <mm-digest-question-end>
           <tr>
    +        <TD BGCOLOR="#dddddd">Please answer the following question to prove that you are not a bot:
    +        <mm-captcha-question>
    +        </TD>
    +        <TD><mm-captcha-box></TD>
    +      </tr>
    +      <tr>
             <td colspan="3">
               <center><MM-Subscribe-Button></center>
             </td>
If you have other languages enabled, you have to translate this patch accordingly.
That’s it! Now bots have to be adapted to your specific questions to be able to
successfully subscribe someone. It is still a good idea to monitor the logs
(/var/log/mailman/subscribe on Debian) to see if any illegitimate requests
still make it through, but unless your site is really big I’d be rather surprised
to see bots being able to answer site-specific questions.