Developers’ Weblog


POSIX locale tracking coming soon

Tags: mksh plan snapshot

I’ve just committed a change to /etc/profile that sets LC_ALL=C.UTF-8 as the default locale. We used to set LC_CTYPE=en_US.UTF-8, which was a little friendlier when forwarded over ssh, but that 2009 proposal of mine (C.UTF-8) is spreading, so we standardise on it now. cleanenv now also sets it in “clean fully” modes (i.e. without dash or slash as first argument), and I expect more to follow.
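For illustration, the resulting profile fragment could look roughly like this. It is a sketch only, not the literal committed change, and it assumes a Bourne-compatible login shell reading /etc/profile:

```sh
# Sketch of an /etc/profile fragment (assumed form, not the actual diff):
# make C.UTF-8 the default locale for login sessions.
LC_ALL=C.UTF-8
export LC_ALL
```

Since LC_ALL overrides every other LC_* variable and LANG, nothing else needs to be set for this to take effect in child processes.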

As a next step, libc will get a binary toggle between C and C.UTF-8 (somewhat again), with locale(1) and setlocale(3) following suit. mksh will implement full locale tracking (for systems without setlocale, C will be the “implementation-specified default locale”, and I think we’ll have the same for MirBSD libc; there’s talk in… Debian or glibc? to switch it to C.UTF-8, but AFAIK that’s not been tested yet, and the locale upon entering main is mandated to be C anyway, so we won’t really gain much except, perhaps, confusion).
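To make the idea of a binary toggle concrete, here is a minimal sketch in POSIX shell of how the effective locale could be derived from the environment. The function name effective_ctype is made up for this example, and the codeset matching is an assumption; it is not the actual mksh or libc implementation:

```sh
# effective_ctype (hypothetical helper): print "C.UTF-8" or "C".
effective_ctype() {
	# POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG;
	# an empty value counts as unset, and the fallback is C.
	_loc=${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}
	# Drop any @modifier so only the codeset after the dot is inspected.
	_loc=${_loc%%@*}
	case $_loc in
	(*.[Uu][Tt][Ff]-8|*.[Uu][Tt][Ff]8)
		# Any UTF-8 locale collapses to C.UTF-8 (assumed rule).
		echo C.UTF-8 ;;
	(*)
		# Everything else collapses to plain C.
		echo C ;;
	esac
}
```

With this sketch, LC_ALL=en_US.UTF-8 yields C.UTF-8, while LC_ALL=C yields C even if LANG names a UTF-8 locale, because LC_ALL takes precedence.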

I may add a double build in which processes that would now be run under the C locale warn (via syslog or so), so that this can be detected, since MirBSD currently has only the C.UTF-8 locale. (This is a problem, but one that has been proven to be so only recently.)

lksh(1) will still consider POSIX compliance only for the C locale, but turning on POSIX mode may no longer turn off UTF-8 mode, as the locale environment variables will then be the only determining factor. (Manually toggling set ±U will of course still work.) In the same vein, presence of the BOM may not affect the UTF-8 mode flag any longer either.

It’ll be a bumpy ride, especially for MirBSD itself, but we’ll surely manage. For mksh(1), it’ll be R60, which will be a real major release, carrying more deep changes. Removal of the cat(1) and sleep(1) builtins is already done, Debian bullseye already carries the early (originally done for SuSE) locale tracking, and users request full 21-bit UCS, which R60 is certainly a very good point in time to implement.

Update 2021-04-08: mksh(1) as shipped in Debian 11 “bullseye” will already implement locale tracking, even though some more changes, such as the BOM handling removal, have not made the cut yet (mostly because I’m testing the changes excessively first).

Some upheaval continues: I’m still working (in the background) on porting latest OpenSSH, and we’ll direly need a newer SSL library. It will probably turn out to be LibreSSL: despite all the trouble with LibreSSL, OpenSSL is dead (the illegal licence change for 3.0 isn’t acceptable, as the chosen new licence not only is less free, it’s also incompatible with the GPLv2 just as the old licence!), and the other forks are even more questionable and fragile, and it is unclear whether they are workable with a BSD at all. Time will tell.

Thankfully, at least the recent sudo(8) issue did not hit us. But given sudo was already at version 1.4 in 1996(!), and after having backported the fixes to some old Debian releases and derivatives, I understand why OpenBSD threw it all away and wrote doas.
