r103795 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: < r103794 | r103795 | r103796 >
Date: 01:45, 21 November 2011
Author: mah
Status: reverted (Comments)
Tags:
Comment:
Fixes Bug 31865 - Tag <dws> for discarding whitespaces.
Patch with parser tests from Van de Bugger
Modified paths:
  • /trunk/phase3/includes/parser/Preprocessor_DOM.php (modified)
  • /trunk/phase3/includes/parser/Preprocessor_Hash.php (modified)
  • /trunk/phase3/tests/parser/parserTests.txt (modified)

Diff

Index: trunk/phase3/tests/parser/parserTests.txt
@@ -8930,6 +8930,97 @@
 </p>
 !! end
 
+!! test
+Bug 31865: HTML-style tag <dws> is recognized and discarded.
+!! input
+one<dws>two
+!! result
+<p>onetwo
+</p>
+!! end
+
+!! test
+Bug 31865: XML-style tag <dws/> is recognized and discarded.
+!! input
+one<dws/>two
+!! result
+<p>onetwo
+</p>
+!! end
+
+!! test
+Bug 31865: Spaces after <dws> tag are discarded.
+!! input
+one<dws> two
+!! result
+<p>onetwo
+</p>
+!! end
+
+!! test
+Bug 31865: Tabs after <dws> tag are discarded too.
+!! input
+one<dws>	two
+!! result
+<p>onetwo
+</p>
+!! end
+
+!! test
+Bug 31865: Newlines after <dws> tag are discarded too.
+!! input
+one<dws>
+
+
+two
+!! result
+<p>onetwo
+</p>
+!! end
+
+!! test
+Bug 31865: Spaces before <dws> tag are not discarded.
+!! input
+one <dws>two
+!! result
+<p>one two
+</p>
+!! end
+
+!! test
+Bug 31865: <dws> Continuation is indented.
+!! input
+one<dws>
+ two
+!! result
+<p>onetwo
+</p>
+!! end
+
+!! test
+Bug 31865: <dws> List item continuation.
+!! input
+* one<dws>
+ two
+* three
+!! result
+<ul><li> onetwo
+</li><li> three
+</li></ul>
+
+!! end
+
+!! test
+Bug 31865: <dws/> XML-style; asterisk after the tag does not start list item.
+!! input
+* one <dws/>
+* two
+!! result
+<ul><li> one * two
+</li></ul>
+
+!! end
+
 TODO:
 more images
 more tables
Index: trunk/phase3/includes/parser/Preprocessor_Hash.php
@@ -153,6 +153,9 @@
 			$ignoredElements = array( 'includeonly' );
 			$xmlishElements[] = 'includeonly';
 		}
+		// `dws' stands for "discard white spaces". `<dws>' and all the whitespaces afer it are
+		// discarded.
+		$xmlishElements[] = 'dws';
 		$xmlishRegex = implode( '|', array_merge( $xmlishElements, $ignoredTags ) );
 
 		// Use "A" modifier (anchored) instead of "^", because ^ doesn't work with an offset
@@ -350,6 +353,17 @@
 				}
 
 				$tagStartPos = $i;
+
+				// Handle tag dws.
+				if ( $name == 'dws' ) {
+					$i = $tagEndPos + 1;
+					if ( preg_match( '/\s*/', $text, $matches, 0, $i ) ) {
+						$i += strlen( $matches[0] );
+					}
+					$accum->addNodeWithText( 'ignore', substr( $text, $tagStartPos, $i - $tagStartPos ) );
+					continue;
+				}
+
 				if ( $text[$tagEndPos-1] == '/' ) {
 					// Short end tag
 					$attrEnd = $tagEndPos - 1;
Index: trunk/phase3/includes/parser/Preprocessor_DOM.php
@@ -211,6 +211,9 @@
 			$ignoredElements = array( 'includeonly' );
 			$xmlishElements[] = 'includeonly';
 		}
+		// `dws' stands for "discard white spaces". `<dws>' and all the whitespaces afer it are
+		// discarded.
+		$xmlishElements[] = 'dws';
 		$xmlishRegex = implode( '|', array_merge( $xmlishElements, $ignoredTags ) );
 
 		// Use "A" modifier (anchored) instead of "^", because ^ doesn't work with an offset
@@ -406,6 +409,20 @@
 				}
 
 				$tagStartPos = $i;
+
+				// Handle tag `dws'.
+				if ( $name == 'dws' ) {
+					$i = $tagEndPos + 1;
+					if ( preg_match( '/\s*/', $text, $matches, 0, $i ) ) {
+						$i += strlen( $matches[0] );
+					}
+					$accum .=
+						'<ignore>' .
+						htmlspecialchars( substr( $text, $tagStartPos, $i - $tagStartPos ) ) .
+						'</ignore>';
+					continue;
+				}
+
 				if ( $text[$tagEndPos-1] == '/' ) {
 					$attrEnd = $tagEndPos - 1;
 					$inner = null;
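
Illustration (not part of the commit)

Both preprocessor hunks do the same thing: once a dws tag is found, the cursor moves past the tag, then past any run of whitespace that follows, and the whole span is emitted as an ignored node. The standalone sketch below mimics only that cursor logic, so the parser test expectations can be checked by hand. The function name discardDws and the tag-matching regex are assumptions made for this example; the real preprocessor locates tags through $xmlishRegex and builds a node tree rather than a string.

```php
<?php
// Hypothetical helper, NOT MediaWiki code: drops every <dws> or <dws/> tag
// together with the whitespace that immediately follows it.
function discardDws( string $text ): string {
	$out = '';
	$i = 0;
	// Assumed tag pattern for illustration; the patch relies on $xmlishRegex.
	while ( preg_match( '/<dws\s*\/?>/', $text, $m, PREG_OFFSET_CAPTURE, $i ) ) {
		$tagStartPos = $m[0][1];
		$tagEndPos = $tagStartPos + strlen( $m[0][0] ) - 1;
		// Keep everything before the tag, including whitespace before it.
		$out .= substr( $text, $i, $tagStartPos - $i );
		// Same steps as the patch: jump past the tag, then past whitespace.
		// The "A" modifier anchors the match at the offset explicitly.
		$i = $tagEndPos + 1;
		if ( preg_match( '/\s*/A', $text, $ws, 0, $i ) ) {
			$i += strlen( $ws[0] );
		}
	}
	return $out . substr( $text, $i );
}

echo discardDws( "one<dws>\n\n\ntwo" ), "\n";   // onetwo
echo discardDws( "one <dws>two" ), "\n";        // one two
echo discardDws( "* one <dws/>\n* two" ), "\n"; // * one * two
```

The last call shows why the final parser test expects a single list item: the newline after the tag is consumed, so the second asterisk is no longer at the start of a line. The patch's unanchored '/\s*/' behaves the same way as the anchored form used here, because an empty match always succeeds at the starting offset.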

Follow-up revisions

Revision | Commit summary | Author | Date
r103990 | Revert r103795 -- adds <dws> pseudotag which modifies preprocessor behavior i... | brion | 00:30, 23 November 2011

Comments

#Comment by OverlordQ (talk | contribs)   20:02, 21 November 2011

Breaks test for bug 32057.

Since the comment parser here completely mangles this, see here

#Comment by OverlordQ (talk | contribs)   20:26, 21 November 2011

FML, was reproducible by running the parser tests.

I tried to reproduce it by hand by creating the system message and trying to subst it on a page, which worked.

The parser tests then didn't fail even after deleting the message.

The actual output given in the linked file is what you get when the message doesn't exist, so it might have been a temporary failure or something else strange going on. I did bisect back to this version; however, I may have forgotten a layer of caching, so the output from another rev may have been cached. I'll try to walk it back again.

#Comment by OverlordQ (talk | contribs)   20:52, 21 November 2011

Tried stepping through on a separate install, but couldn't reproduce, so I have no clue what the hiccup was. Re-marking as new.

#Comment by Brion VIBBER (talk | contribs)   00:29, 23 November 2011

I'm not sure we want to do things that modify the preprocessor without *reallllllly* being sure this is desirable and probably talking it over in wikitech-l and possibly wikitext-l.
