r96655 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: < r96654 | r96655 | r96656 >
Date: 11:28, 9 September 2011
Author: catrope
Status: ok (Comments)
Tags:
Comment:
Commit live hack: pass XML_PARSE_HUGE (code uses 1 << 19 because the constant isn't available for some reason) into DOMDocument::loadXML() if the first call to loadXML() failed. This prevents newer versions of libxml2 from throwing a warning and messing up when the XML contains structures that are nested more than 256 levels deep. RELEASE-NOTES added to the 1.18 file, tagging this for backporting to 1.18 too.

We at Wikimedia never noticed this issue until we upgraded libxml2 on one of our servers as part of an OS upgrade, but apparently the interwebs knew about this since at least May 2010. Hat tip
to http://deriksmith.livejournal.com/57617.html , where I found this fix.
Modified paths:
  • /trunk/phase3/RELEASE-NOTES-1.18 (modified)
  • /trunk/phase3/includes/parser/Preprocessor_DOM.php (modified)

Diff

Index: trunk/phase3/RELEASE-NOTES-1.18
@@ -444,6 +444,8 @@
   #REDIRECT [[Foo]] is invalid JS
 * Tracking categories are no longer shown in footer for special pages
 * $wgOverrideSiteFeed no longer double escapes urls.
+* The preprocessor no longer fails with a PHP warning about XML_PARSE_HUGE when
+  processing complex pages using newer versions of libxml2.
 
 === API changes in 1.18 ===
 * BREAKING CHANGE: action=watch now requires POST and token.
Index: trunk/phase3/includes/parser/Preprocessor_DOM.php
@@ -155,7 +155,8 @@
 if ( !$result ) {
 	// Try running the XML through UtfNormal to get rid of invalid characters
 	$xml = UtfNormal::cleanUp( $xml );
-	$result = $dom->loadXML( $xml );
+	// 1 << 19 == XML_PARSE_HUGE, needed so newer versions of libxml2 don't barf when the XML is >256 levels deep
+	$result = $dom->loadXML( $xml, 1 << 19 );
 	if ( !$result ) {
 		throw new MWException( __METHOD__.' generated invalid XML' );
 	}
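
The retry pattern in this hunk can be sketched outside MediaWiki roughly as follows. This is only an illustration under stated assumptions: the function name loadXmlWithHugeFallback() is hypothetical, the UtfNormal::cleanUp() step is omitted because it is MediaWiki-specific, and RuntimeException stands in for MWException. Whether the first attempt actually fails depends on the libxml2 version in use.

```php
<?php
// Hypothetical helper illustrating the fallback from r96655:
// try a plain loadXML() first, then retry with XML_PARSE_HUGE (1 << 19).
function loadXmlWithHugeFallback( $xml ) {
	$dom = new DOMDocument;

	// Collect libxml errors internally instead of emitting PHP warnings.
	$previous = libxml_use_internal_errors( true );

	$result = $dom->loadXML( $xml );
	if ( !$result ) {
		// 1 << 19 == XML_PARSE_HUGE; lifts the default nesting-depth limit.
		$result = $dom->loadXML( $xml, 1 << 19 );
	}

	// Restore the caller's error-handling mode.
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );

	if ( !$result ) {
		throw new RuntimeException( __FUNCTION__ . ' generated invalid XML' );
	}
	return $dom;
}

// Usage: a document nested 300 levels deep, beyond the 256-level limit
// mentioned in the commit message.
$deepXml = str_repeat( '<n>', 300 ) . str_repeat( '</n>', 300 );
$deepDom = loadXmlWithHugeFallback( $deepXml );
```

Passing the flag only on the second attempt mirrors the diff above, where the first loadXML() call is left unchanged.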

Follow-up revisions

Revision | Commit summary | Author | Date
r96656 | 1.17wmf1: MFT r96655, which was a commit of a live hack to begin with | catrope | 11:32, 9 September 2011
r96849 | REL1_18: r96509, r96522, r96606, r96643, r96645, r96655, r96659, r96687, r967...... | reedy | 15:03, 12 September 2011

Comments

Comment by Hashar   14:14, 9 September 2011

Related to PHP bug: https://bugs.php.net/bug.php?id=49660

The bug appears with libxml2 2.7.3+ and was fixed in the PHP libxml extension as of:

version 5.3.2 - 04-Mar-2010
version 5.2.12 - 17-Dec-2009

Look for PARSEHUGE in PHP changelog: http://php.net/ChangeLog-5.php

Preprocessor_DOM could define the constant itself if it does not exist.
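
A minimal sketch of that suggestion, assuming the constant in question is LIBXML_PARSEHUGE (the name PHP's libxml extension uses for XML_PARSE_HUGE) and reusing the 1 << 19 value from the commit. The placement and the sample XML string are illustrative only, not what Preprocessor_DOM.php actually contains.

```php
<?php
// Assumption: older PHP builds lack LIBXML_PARSEHUGE, so define it here
// with the value of libxml2's XML_PARSE_HUGE flag (1 << 19 == 524288).
if ( !defined( 'LIBXML_PARSEHUGE' ) ) {
	define( 'LIBXML_PARSEHUGE', 1 << 19 );
}

// The named constant can then replace the magic number in the loadXML() call.
$dom = new DOMDocument;
$result = $dom->loadXML( '<root><a/></root>', LIBXML_PARSEHUGE );
```

On PHP versions that already ship the constant, the defined() check makes the block a no-op.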

Comment by Platonides   21:50, 9 September 2011

Heh, I remember wondering why I had needed XML_PARSE_HUGE while Wikipedia worked without it.
