Developers’ Weblog


⚠ This page contains old, outdated, obsolete, … historic or WIP content! No warranties e.g. for correctness!


How (not to) encode MIME headers

2011-05-26 by (EvolvisForge blog)
Tags: work debian

I was just tracking down why some mails seem to have garbled Subjects.

It looked like this in Alpine:
Subject: [BOFH commits] r2040: fix directory struct =?UTF-8?Q?ure=E2=86=B5=20unix?=/mirror/=?UTF-8?Q?=20=E2=86=92=20mirror?=. bonn

The raw header looked like this:
Subject: [BOFH commits] =?utf-8?q?r2040=3A__fix_directory_struct__=3D=3FUT?=
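Decoding that line exactly once makes the problem easier to see. A minimal sketch in Python (picked only because Mailman is written in it), using the standard library's `email.header.decode_header`; the helper names `decode_once` and `looks_double_encoded` are made up for this illustration, and only the header line quoted above is used:

```python
from email.header import decode_header

# The raw Subject value exactly as quoted above (nothing more is known).
RAW = "[BOFH commits] =?utf-8?q?r2040=3A__fix_directory_struct__=3D=3FUT?="


def decode_once(value):
    """Decode all RFC 2047 encoded-words in a header value exactly once."""
    parts = []
    for chunk, charset in decode_header(value):
        # decode_header yields bytes for decoded words; plain text may be str.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


def looks_double_encoded(value):
    """Heuristic: an encoded-word start marker survives one decoding pass."""
    return "=?" in decode_once(value)


if __name__ == "__main__":
    print(decode_once(RAW))
    print(looks_double_encoded(RAW))
```

The decoded text still ends in `=?UT`, i.e. the start of a second encoded-word that somebody Q-encoded again. This is only a heuristic: a subject that legitimately contains `=?` would trigger it too.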

Shame on anyone who thinks evil of it… The first suspect was, of course, Mailman: after decoding, this showed the classic signs of a double encode. (That was after ruling out general header brokenness, but no, 76 chars is OK.) I thus hand-crafted an email with the correct header line and sent that out:
Subject: [BOFH commits] =?utf-8?q?r2040=3A_fix_directory_structure?=

Huh? So Mailman and Python weren’t the culprits; this is correct mangling. Okay, let’s dive into the Perl code that actually sends out the emails. To make a long story short, have a look at this, then at RFC 2047:
tglase@tglase:~ $ perl -MEncode -e '$subject = "r2040: fix directory structure↵ unix/mirror/ → mirror.bonn"; Encode::from_to($subject, "UTF-8", "MIME-Q"); print "{Subject: ".$subject."}\n";'
{Subject: r2040:=?UTF-8?Q?=20fix=20directory=20struct?=

Amazingly enough, PHP’s mb_encode_mimeheader, despite being trashed in the comments on its online documentation, does manage to get it right:
tglase@tglase:~ $ php -r 'mb_internal_encoding("UTF-8");echo "{".mb_encode_mimeheader("Subject: r2040: fix directory structure↵ unix/mirror/ → mirror.bonn", "UTF-8", "Q")."}\n";'
{Subject: r2040: fix directory =?UTF-8?Q?structure=E2=86=B5=20unix/mirror/?=

Wow. Now, the Perl guys I know told me to use Perl’s Mail tools… which are much too high-level, though: all I have is the subject string, and all I want is an RFC 822 header line. I told them I’m not above doing this, and so I did. The 3P languages (Perl, PHP, Python) can really be annoying.
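For comparison, the third P can do the "subject string in, header line out" job with just its standard library. A minimal sketch, assuming Python 3 and its `email.header` module; `encode_subject` and `decode_subject` are names invented here, and nothing is claimed about whether the module picks Q or Base64, only that the result should round-trip and keep each line within 76 characters:

```python
from email.header import Header, decode_header, make_header


def encode_subject(subject):
    """Return a folded 'Subject: ...' header line for an arbitrary string."""
    # header_name tells the encoder how much of the first line is already
    # taken by "Subject: "; maxlinelen=76 is the RFC 2047 line limit.
    value = Header(
        subject, charset="utf-8", header_name="Subject", maxlinelen=76
    ).encode()
    return "Subject: " + value


def decode_subject(line):
    """Inverse of encode_subject, used here only as a sanity check."""
    value = line.split(":", 1)[1].strip()
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    text = "r2040: fix directory structure↵ unix/mirror/ → mirror.bonn"
    folded = encode_subject(text)
    print(folded)
    print(decode_subject(folded) == text)
```

Note that `Header.encode()` folds with a bare `\n` by default; for a message going out on the wire, the line separator would need to be CRLF.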

Why’s Perl’s output wrong anyway? I don’t know for sure, but I think the atoms must be separated, so leaving /mirror/ unencoded in the middle, with no spaces around it, is wrong. (Besides, Encode::from_to can’t do the job right anyway, as it doesn’t know the name of the header, which counts towards the 76 chars allowed for the first line. BAD • Broken As Designed.)
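RFC 2047 does back up that hunch: in an unstructured header like Subject, an encoded-word has to be separated from adjacent text or other encoded-words by linear white space, a single encoded-word may be at most 75 characters long, and a line containing one at most 76. A rough checker for just these rules, sketched in Python; `check_header_line` and the regular expression are illustrative assumptions, not a full RFC 2047 parser (comments and phrases in structured headers are ignored):

```python
import re

# Loose pattern for an RFC 2047 encoded-word: =?charset?Q-or-B?text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[QqBb]\?[^?\s]*\?=")
MAX_LINE = 76  # limit for a line that contains an encoded-word
MAX_WORD = 75  # limit for a single encoded-word

# First output lines of the Perl and PHP one-liners, minus the "{".
PERL_LINE = "Subject: r2040:=?UTF-8?Q?=20fix=20directory=20struct?="
PHP_LINE = (
    "Subject: r2040: fix directory "
    "=?UTF-8?Q?structure=E2=86=B5=20unix/mirror/?="
)


def check_header_line(line):
    """Return a list of problems found in one physical header line."""
    problems = []
    words = list(ENCODED_WORD.finditer(line))
    if words and len(line) > MAX_LINE:
        problems.append(f"line is {len(line)} chars, limit is {MAX_LINE}")
    for m in words:
        word = m.group(0)
        if len(word) > MAX_WORD:
            problems.append(f"encoded-word too long: {word}")
        # Start and end of the line count as properly separated.
        before = line[m.start() - 1] if m.start() > 0 else " "
        after = line[m.end()] if m.end() < len(line) else " "
        if not before.isspace():
            problems.append(f"no whitespace before {word}")
        if not after.isspace():
            problems.append(f"no whitespace after {word}")
    return problems


if __name__ == "__main__":
    for candidate in (PERL_LINE, PHP_LINE):
        print(candidate, check_header_line(candidate), sep="\n  ")
```

With the two lines above, the Perl output gets flagged because `r2040:` runs straight into the encoded-word, while the PHP line passes. The header name is part of the line being measured, which is exactly what a function that never sees the name cannot account for.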

Disclaimer: I don’t really know any Perl, I fight my way through PHP and, barely, Python. (But I can code.)
