r45387 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: < r45386 | r45387 | r45388 >
Date: 02:29, 4 January 2009
Author: vyznev
Status: reverted
Tags:
Comment:
Add special case handling of the XHTML character entity "&apos;" to normalizeEntity() and decodeEntity(). This should resolve the remainder of bug 14365.
It might seem cleaner to just add the appropriate entry to $wgHtmlEntityAliases, but this would break decodeEntity() as currently written. Explicitly note this in the comments.
Modified paths:
  • /trunk/phase3/RELEASE-NOTES (modified)
  • /trunk/phase3/includes/Sanitizer.php (modified)

Diff

Index: trunk/phase3/includes/Sanitizer.php
@@ -59,6 +59,9 @@
 /**
  * List of all named character entities defined in HTML 4.01
  * http://www.w3.org/TR/html4/sgml/entities.html
+ * This list does *not* include &apos;, which is part of XHTML
+ * 1.0 but not HTML 4.01. It is handled as a special case in
+ * the code.
  * @private
  */
 global $wgHtmlEntities;
@@ -318,6 +321,7 @@

 /**
  * Character entity aliases accepted by MediaWiki
+ * XXX: decodeEntity() assumes that all values in this array are valid keys to $wgHtmlEntities
  */
 global $wgHtmlEntityAliases;
 $wgHtmlEntityAliases = array(
@@ -954,7 +958,7 @@
  * encoded text for an attribute value.
  *
  * See http://www.w3.org/TR/REC-xml/#AVNormalize for background,
- * but note that we're not returning the value, but are returning
+ * but note that we are not returning the value, but are returning
  * XML source fragments that will be slapped into output.
  *
  * @param string $text
@@ -1032,6 +1036,8 @@
     return "&{$wgHtmlEntityAliases[$name]};";
 } elseif( isset( $wgHtmlEntities[$name] ) ) {
     return "&$name;";
+} elseif( $name == 'apos' ) {
+    return "&#39;"; // "&apos;" is valid in XHTML, but not in HTML4
 } else {
     return "&amp;$name;";
 }
@@ -1133,6 +1139,8 @@
 }
 if( isset( $wgHtmlEntities[$name] ) ) {
     return codepointToUtf8( $wgHtmlEntities[$name] );
+} elseif( $name == 'apos' ) {
+    return "'"; // "&apos;" is not in $wgHtmlEntities, but it's still valid XHTML
 } else {
     return "&$name;";
 }
Index: trunk/phase3/RELEASE-NOTES
@@ -466,6 +466,8 @@
   local URLs
 * (bug 16376) Mention in deleteBatch.php and moveBatch.php maintenance scripts
   that STDIN can be used for page list
+* Sanitizer::decodeCharReferences() now decodes the XHTML "&apos;" character
+  entity (loosely related to bug 14365)


 === API changes in 1.14 ===
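
The commit comment says that adding an entry to $wgHtmlEntityAliases instead would break decodeEntity(). A minimal sketch of why, assuming decodeEntity() resolves aliases first and then looks the result up in $wgHtmlEntities (as the XXX note in the diff implies). The tables are abbreviated stand-ins, the `'apos' => '#39'` alias is hypothetical, and `decodeEntitySketch()` is not a MediaWiki function:

```php
<?php
// Abbreviated stand-ins for $wgHtmlEntities / $wgHtmlEntityAliases (assumed contents).
$entities = array( 'amp' => 38, 'quot' => 34 ); // entity name => code point
$aliases  = array( 'apos' => '#39' );           // hypothetical alias entry

// Assumed lookup order: resolve the alias first, then consult the entity table.
function decodeEntitySketch( $name, array $entities, array $aliases ) {
    if ( isset( $aliases[$name] ) ) {
        $name = $aliases[$name];
    }
    if ( isset( $entities[$name] ) ) {
        // html_entity_decode() stands in for MediaWiki's codepointToUtf8() here.
        return html_entity_decode( '&#' . $entities[$name] . ';', ENT_QUOTES, 'UTF-8' );
    }
    return "&$name;"; // not a known entity name: returned undecoded
}

echo decodeEntitySketch( 'quot', $entities, $aliases ), "\n"; // prints "
echo decodeEntitySketch( 'apos', $entities, $aliases ), "\n"; // prints &#39; (still encoded)
```

Under these assumptions the alias value '#39' is not a key of the entity table, so the apostrophe comes back still encoded rather than as a character.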

Follow-up revisions

Revision | Commit summary | Author | Date
r45477 | Revert r45387 "Add special case handling of the XHTML character entity "&apos... | brion | 02:31, 7 January 2009
r89681 | Followup to r86061: add parser test case to confirm that '&apos' in wikitext ... | brion | 20:11, 7 June 2011

Past revisions this follows-up on

Revision | Commit summary | Author | Date
r44370 | (bug 14365) skip invalid titles in RepoGroup::findFiles() | vyznev | 23:20, 9 December 2008

Comments

Comment by Brion VIBBER, 02:32, 7 January 2009

Reverted in r45477 -- this special casing doesn't make any sense to me. Why not just add it to $wgHtmlEntities if it's valid XHTML 1?
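
The alternative raised in this comment can be sketched roughly as follows. This is an illustration only: the table is an abbreviated stand-in for $wgHtmlEntities with a hypothetical `'apos' => 39` entry, and `decodeWithTable()` / `normalizeWithTable()` are made-up names that mirror the branches visible in the diff, not the real Sanitizer methods.

```php
<?php
// Abbreviated stand-in for $wgHtmlEntities with a hypothetical 'apos' entry added.
$entitiesWithApos = array( 'amp' => 38, 'quot' => 34, 'apos' => 39 );

// Decode: a table hit becomes the character, anything else is left as-is.
function decodeWithTable( $name, array $entities ) {
    if ( isset( $entities[$name] ) ) {
        // html_entity_decode() stands in for MediaWiki's codepointToUtf8() here.
        return html_entity_decode( '&#' . $entities[$name] . ';', ENT_QUOTES, 'UTF-8' );
    }
    return "&$name;";
}

// Normalize: a table hit is passed through, anything else has its ampersand escaped.
function normalizeWithTable( $name, array $entities ) {
    return isset( $entities[$name] ) ? "&$name;" : "&amp;$name;";
}

echo decodeWithTable( 'apos', $entitiesWithApos ), "\n";    // prints '
echo normalizeWithTable( 'apos', $entitiesWithApos ), "\n"; // prints &apos;
```

With the table entry, no special case is needed in either function. The second line shows the trade-off, though: the normalized output would then contain &apos; itself, which the patch's own comment notes is part of XHTML 1.0 but not HTML 4.01.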
