r49654 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r49653‎ | r49654 | r49655 >
Date:23:48, 19 April 2009
Author:skizzerz
Status:resolved (Comments)
Tags:
Comment:
Follow-up on r49330
* re-add $wgRestrictDisplayTitle
* revert r49610
* prevent block-level and other such tags from being used in DISPLAYTITLE (while still allowing tags such as <sup> and <sub>)
Modified paths:
  • /trunk/phase3/RELEASE-NOTES (modified) (history)
  • /trunk/phase3/includes/DefaultSettings.php (modified) (history)
  • /trunk/phase3/includes/Sanitizer.php (modified) (history)
  • /trunk/phase3/includes/parser/CoreParserFunctions.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/parser/CoreParserFunctions.php
@@ -236,13 +236,25 @@
237237 * @param string $text Desired title text
238238 * @return string
239239 */
240 - static function displaytitle( $parser, $displayTitle = '' ) {
 240+ static function displaytitle( $parser, $text = '' ) {
 241+ global $wgRestrictDisplayTitle;
 242+
 243+ #list of disallowed tags for DISPLAYTITLE
 244+ #these will be escaped even though they are allowed in normal wiki text
 245+ $bad = array( 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'div', 'blockquote', 'ol', 'ul', 'li',
 246+ 'table', 'tr', 'th', 'td', 'dl', 'dd', 'caption', 'p', 'ruby', 'rb', 'rt', 'rp' );
 247+
241248 #only requested titles that normalize to the actual title are allowed through
242249 #mimic the escaping process that occurs in OutputPage::setPageTitle
243 - $title = Title::newFromText( Sanitizer::stripAllTags( Sanitizer::normalizeCharReferences( Sanitizer::removeHTMLtags( $displayTitle ) ) ) );
 250+ $text = Sanitizer::normalizeCharReferences( Sanitizer::removeHTMLtags( $text, null, array(), array(), $bad ) );
 251+ $title = Title::newFromText( Sanitizer::stripAllTags( $text ) );
244252
245 - if ( $title instanceof Title && $title->getFragment() == '' && $title->equals( $parser->mTitle ) ) {
246 - $parser->mOutput->setDisplayTitle( $displayTitle );
 253+ if( !$wgRestrictDisplayTitle ) {
 254+ $parser->mOutput->setDisplayTitle( $text );
 255+ } else {
 256+ if ( $title instanceof Title && $title->getFragment() == '' && $title->equals( $parser->mTitle ) ) {
 257+ $parser->mOutput->setDisplayTitle( $text );
 258+ }
247259 }
248260
249261 return '';
Index: trunk/phase3/includes/Sanitizer.php
@@ -338,9 +338,11 @@
339339 * @param string $text
340340 * @param callback $processCallback to do any variable or parameter replacements in HTML attribute values
341341 * @param array $args for the processing callback
 342+ * @param array $extratags for any extra tags to include
 343+ * @param array $removetags for any tags (default or extra) to exclude
342344 * @return string
343345 */
344 - static function removeHTMLtags( $text, $processCallback = null, $args = array(), $extratags = array() ) {
 346+ static function removeHTMLtags( $text, $processCallback = null, $args = array(), $extratags = array(), $removetags = array() ) {
345347 global $wgUseTidy;
346348
347349 static $htmlpairs, $htmlsingle, $htmlsingleonly, $htmlnest, $tabletags,
@@ -377,8 +379,10 @@
378380 'li',
379381 );
380382
381 - $htmlsingleallowed = array_merge( $htmlsingle, $tabletags );
382 - $htmlelements = array_merge( $htmlsingle, $htmlpairs, $htmlnest );
 383+ $htmlsingleallowed = array_unique( array_merge( $htmlsingle, $tabletags ) );
 384+ # Only allow elements that aren't specified in $removetags
 385+ # Doing it here since this is the top-level check
 386+ $htmlelements = array_diff( array_unique( array_merge( $htmlsingle, $htmlpairs, $htmlnest ) ), $removetags );
383387
384388 # Convert them all to hashtables for faster lookup
385389 $vars = array( 'htmlpairs', 'htmlsingle', 'htmlsingleonly', 'htmlnest', 'tabletags',
Index: trunk/phase3/includes/DefaultSettings.php
@@ -3463,6 +3463,11 @@
34643464 $wgAllowDisplayTitle = true;
34653465
34663466 /**
 3467+ * for consistency, restrict DISPLAYTITLE to titles that normalize to the same canonical DB key
 3468+ */
 3469+$wgRestrictDisplayTitle = true;
 3470+
 3471+/**
34673472 * Array of usernames which may not be registered or logged in from
34683473 * Maintenance scripts can still use these
34693474 */
Index: trunk/phase3/RELEASE-NOTES
@@ -26,15 +26,12 @@
2727 * Added $wgNoFollowDomainExceptions to allow exempting particular domain names
2828 from rel="nofollow" on external links
2929 * (bug 12970) Brought back $wgUseImageResize.
30 -* Added $wgRedirectOnLogin to allow specifying a page to redirect users to upon
31 - logging in (for example, "Main Page")
 30+* Added $wgRedirectOnLogin to allow specifying a specifc page to redirect users
 31+ to upon logging in (ex: "Main Page")
3232 * Add $wgExportFromNamespaces for enabling/disabling the "export all from
3333 namespace" option (disabled by default)
3434 * (bug 18222) $wgMinimalPasswordLength default is now 1
3535 * $wgSessionHandler can be used to configure session.save_handler
36 -* Removed $wgRestrictDisplayTitle, in effect permanently setting it to true.
37 - Without this variable, the DISPLAYTITLE magic word will only accept titles
38 - that are equivalent to the actual page title.
3936
4037 === New features in 1.15 ===
4138

Follow-up revisions

RevisionCommit summaryAuthorDate
r49720* Follow up r49654: move the $extratags and $removetags processing outside of...skizzerz21:54, 21 April 2009

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r49330(bug 12998) Allow <sup>, <sub>, etc. in DISPLAYTITLErememberthedot05:15, 9 April 2009
r49610Follow-up to r49330: removed $wgRestrictDisplayTitle from DefaultSettings.phprememberthedot00:56, 18 April 2009

Comments

#Comment by Remember the dot (talk | contribs)   00:32, 20 April 2009

removeHTMLtags now catches invalid tags in the title <h1>, which is good, but other <h1>s in the page are still not being sanitized. For example:

<h1>
    <ol>
        <li>Evel</li>
        <li>Knievel</li>
    </ol>
</h1>

Also...why are 'ruby', 'rb', 'rt', and 'rp' in the list of bad tags? These are not valid HTML tags, so the parser should automatically escape them without being asked.

#Comment by Skizzerz (talk | contribs)   00:56, 20 April 2009

ruby and the like were inside the Sanitizer method that allows specific HTML tags, and they were on the allowed list. Also, they are indeed valid HTML tags, it's just that many browsers don't support them.

As for the h1 in wiki text, that's an issue with the Parser and not with the displaytitle function, so it's unrelated (although it now should be fixable)

#Comment by Remember the dot (talk | contribs)   01:06, 20 April 2009

The HTML specification (http://www.w3.org/TR/html401/index/elements.html) doesn't list any Ruby tags. These custom tags should not be making it through the sanitizer.

I had thought that the most logical way to do h1 sanitation would be to give removeHTMLtags context information (like "block" or "h1") and have it do the removal as though it had encountered an h1 tag within the page body, perhaps with some additional restrictions. That way, we wouldn't be duplicating the list of tags forbidden in h1 in both the displaytitle function and the removeHTMLtags function.

#Comment by Brion VIBBER (talk | contribs)   20:41, 28 April 2009

http://www.w3.org/TR/ruby/

Currently only MSIE supports it that I know of, which kind of sucks. :)

#Comment by Nikerabbit (talk | contribs)   06:40, 21 April 2009
#Comment by Skizzerz (talk | contribs)   21:56, 21 April 2009

Fixed in r49720

Status & tagging log