Remove most named character references from output
Recommit of
r66254 to trunk. This was just
find extensions phase3 -iname '*.php' \! -iname '*.i18n.php' \! -iname 'Messages*.php' \! -iname '*_Messages.php' -exec sed -i 's/ /\ /g;s/—/―/g;s/•/•/g;s/á/á/g;s/´/´/g;s/à/à/g;s/α/α/g;s/ä/ä/g;s/ç/ç/g;s/©/©/g;s/↓/↓/g;s/°/°/g;s/é/é/g;s/ê/ê/g;s/ë/ë/g;s/è/è/g;s/€/€/g;s/↔//g;s/…/…/g;s/í/í/g;s/ì/ì/g;s/←/←/g;s/“/“/g;s/·/·/g;s/−/−/g;s/–/–/g;s/ó/ó/g;s/ô/ô/g;s/œ/œ/g;s/ò/ò/g;s/õ/õ/g;s/ö/ö/g;s/£/£/g;s/′/′/g;s/″/″/g;s/»/»/g;s/→/→/g;s/”/”/g;s/Σ/Σ/g;s/×/×/g;s/ú/ú/g;s/↑/↑/g;s/ü/ü/g;s/¥/¥/g' {} +
followed by reading over every single line of the resulting diff and
fixing a whole bunch of false positives. The reason for this change is
given in <
http://lists.wikimedia.org/pipermail/wikitech-l/2010-April/047617.html>.
I cleared it with Tim and Brion on IRC before committing. It might
cause a few problems, but I tried to be careful; please report any
issues.
I skipped all messages files. I plan to make a follow-up commit that
alters wfMsgExt() with 'escapenoentities' to sanitize all the entities.
That way, the only messages that will be problems will be ones that
output raw HTML, and we want to get rid of those anyway.
This should get rid of all named entities everywhere except messages. I
skipped a few things like   that I noticed in manual inspection,
because they weren't well-formed XML anyway.
Also, to everyone who uses non-breaking spaces when they could use a
normal space, or nothing at all, or CSS padding: I still hate you. Die.