r81012 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r81011‎ | r81012 | r81013 >
Date:01:16, 26 January 2011
Author:brion
Status:ok
Tags:
Comment:
Provisionally reverting r80430 (bug 529, 12974)

As noted in CR there are some issues where we're trading one set of weird behavior for a different, but previously unknown, set of weird behavior which breaks existing markup that works around the old behavior.
I'd recommend keeping this in store for after the 1.17 stuff calms down so unexpected parser changes aren't cropping up in the middle of things for people working with trunk.

If these are the right changes to make then great -- but they should be done after the consequences are better understood and folks can prepare for changes.
Modified paths:
  • /trunk/phase3/RELEASE-NOTES (modified) (history)
  • /trunk/phase3/includes/parser/Parser.php (modified) (history)
  • /trunk/phase3/tests/parser/parserTests.txt (modified) (history)

Diff [purge]

Index: trunk/phase3/tests/parser/parserTests.txt
@@ -1210,8 +1210,7 @@
12111211 |}
12121212 !! result
12131213 <table>
1214 -<caption>
1215 -caption
 1214+<caption> caption
12161215 </caption><tr><td></td></tr></table>
12171216
12181217 !! end
@@ -1227,22 +1226,12 @@
12281227 !! result
12291228 <table>
12301229 <tr>
1231 -<td>
1232 -<p>1
1233 -</p>
1234 -</td>
1235 -<td>
1236 -<p>2
1237 -</p>
 1230+<td> 1 </td>
 1231+<td> 2
12381232 </td></tr>
12391233 <tr>
1240 -<td>
1241 -<p>3
1242 -</p>
1243 -</td>
1244 -<td>
1245 -<p>4
1246 -</p>
 1234+<td> 3 </td>
 1235+<td> 4
12471236 </td></tr></table>
12481237
12491238 !! end
@@ -1272,110 +1261,48 @@
12731262 |}
12741263 !! result
12751264 <table border="1" cellpadding="2">
1276 -<caption>
1277 -Multiplication table
 1265+<caption>Multiplication table
12781266 </caption>
12791267 <tr>
1280 -<th>
1281 -<p>&times;
1282 -</p>
1283 -</th>
1284 -<th>
1285 -<p>1
1286 -</p>
1287 -</th>
1288 -<th>
1289 -<p>2
1290 -</p>
1291 -</th>
1292 -<th>
1293 -<p>3
1294 -</p>
 1268+<th> &times; </th>
 1269+<th> 1 </th>
 1270+<th> 2 </th>
 1271+<th> 3
12951272 </th></tr>
12961273 <tr>
1297 -<th>
1298 -<p>1
1299 -</p>
 1274+<th> 1
13001275 </th>
1301 -<td>
1302 -<p>1
1303 -</p>
1304 -</td>
1305 -<td>
1306 -<p>2
1307 -</p>
1308 -</td>
1309 -<td>
1310 -<p>3
1311 -</p>
 1276+<td> 1 </td>
 1277+<td> 2 </td>
 1278+<td> 3
13121279 </td></tr>
13131280 <tr>
1314 -<th>
1315 -<p>2
1316 -</p>
 1281+<th> 2
13171282 </th>
1318 -<td>
1319 -<p>2
1320 -</p>
1321 -</td>
1322 -<td>
1323 -<p>4
1324 -</p>
1325 -</td>
1326 -<td>
1327 -<p>6
1328 -</p>
 1283+<td> 2 </td>
 1284+<td> 4 </td>
 1285+<td> 6
13291286 </td></tr>
13301287 <tr>
1331 -<th>
1332 -<p>3
1333 -</p>
 1288+<th> 3
13341289 </th>
1335 -<td>
1336 -<p>3
1337 -</p>
1338 -</td>
1339 -<td>
1340 -<p>6
1341 -</p>
1342 -</td>
1343 -<td>
1344 -<p>9
1345 -</p>
 1290+<td> 3 </td>
 1291+<td> 6 </td>
 1292+<td> 9
13461293 </td></tr>
13471294 <tr>
1348 -<th>
1349 -<p>4
1350 -</p>
 1295+<th> 4
13511296 </th>
1352 -<td>
1353 -<p>4
1354 -</p>
1355 -</td>
1356 -<td>
1357 -<p>8
1358 -</p>
1359 -</td>
1360 -<td>
1361 -<p>12
1362 -</p>
 1297+<td> 4 </td>
 1298+<td> 8 </td>
 1299+<td> 12
13631300 </td></tr>
13641301 <tr>
1365 -<th>
1366 -<p>5
1367 -</p>
 1302+<th> 5
13681303 </th>
1369 -<td>
1370 -<p>5
1371 -</p>
1372 -</td>
1373 -<td>
1374 -<p>10
1375 -</p>
1376 -</td>
1377 -<td>
1378 -<p>15
1379 -</p>
 1304+<td> 5 </td>
 1305+<td> 10 </td>
 1306+<td> 15
13801307 </td></tr></table>
13811308
13821309 !! end
@@ -1394,26 +1321,16 @@
13951322 !! result
13961323 <table align="right" border="1">
13971324 <tr>
1398 -<td>
1399 -<p>Cell 1, row 1
1400 -</p>
 1325+<td> Cell 1, row 1
14011326 </td>
1402 -<td rowspan="2">
1403 -<p>Cell 2, row 1 (and 2)
1404 -</p>
 1327+<td rowspan="2"> Cell 2, row 1 (and 2)
14051328 </td>
1406 -<td>
1407 -<p>Cell 3, row 1
1408 -</p>
 1329+<td> Cell 3, row 1
14091330 </td></tr>
14101331 <tr>
1411 -<td>
1412 -<p>Cell 1, row 2
1413 -</p>
 1332+<td> Cell 1, row 2
14141333 </td>
1415 -<td>
1416 -<p>Cell 3, row 2
1417 -</p>
 1334+<td> Cell 3, row 2
14181335 </td></tr></table>
14191336
14201337 !! end
@@ -1434,26 +1351,18 @@
14351352 !! result
14361353 <table border="1">
14371354 <tr>
1438 -<td>
1439 -<p>&alpha;
1440 -</p>
 1355+<td> &alpha;
14411356 </td>
14421357 <td>
14431358 <table bgcolor="#ABCDEF" border="2">
14441359 <tr>
1445 -<td>
1446 -<p>nested
1447 -</p>
 1360+<td>nested
14481361 </td></tr>
14491362 <tr>
1450 -<td>
1451 -<p>table
1452 -</p>
 1363+<td>table
14531364 </td></tr></table>
14541365 </td>
1455 -<td>
1456 -<p>the original table again
1457 -</p>
 1366+<td>the original table again
14581367 </td></tr></table>
14591368
14601369 !! end
@@ -1467,9 +1376,7 @@
14681377 !! result
14691378 <table>
14701379 <tr>
1471 -<td>
1472 -<p>broken
1473 -</p>
 1380+<td>broken
14741381 </td></tr></table>
14751382
14761383 !! end
@@ -1483,14 +1390,9 @@
14841391 !! result
14851392 <table>
14861393 <tr>
1487 -<td>
1488 -<p>[<a href="ftp://%7Cx" class="external free" rel="nofollow">ftp://%7Cx</a>
1489 -</p>
 1394+<td>[<a href="ftp://%7Cx" class="external free" rel="nofollow">ftp://%7Cx</a></td>
 1395+<td>]" onmouseover="alert(document.cookie)"&gt;test
14901396 </td>
1491 -<td>
1492 -<p>]" onmouseover="alert(document.cookie)"&gt;test
1493 -</p>
1494 -</td>
14951397 </tr>
14961398 </table>
14971399
@@ -2702,9 +2604,7 @@
27032605 !! result
27042606 <table>
27052607 <tr>
2706 -<td>
2707 -<p>[[{{{1}}}|{{{2}}}]]
2708 -</p>
 2608+<td>[[{{{1}}}|{{{2}}}]]
27092609 </td></tr></table>
27102610
27112611 !! end
@@ -2814,22 +2714,12 @@
28152715 </p>
28162716 <table>
28172717 <tr>
2818 -<td>
2819 -<p>1
2820 -</p>
2821 -</td>
2822 -<td>
2823 -<p>2
2824 -</p>
 2718+<td> 1 </td>
 2719+<td> 2
28252720 </td></tr>
28262721 <tr>
2827 -<td>
2828 -<p>3
2829 -</p>
2830 -</td>
2831 -<td>
2832 -<p>4
2833 -</p>
 2722+<td> 3 </td>
 2723+<td> 4
28342724 </td></tr></table>
28352725
28362726 !! end
@@ -2844,22 +2734,12 @@
28452735 </p>
28462736 <table>
28472737 <tr>
2848 -<td>
2849 -<p>1
2850 -</p>
2851 -</td>
2852 -<td>
2853 -<p>2
2854 -</p>
 2738+<td> 1 </td>
 2739+<td> 2
28552740 </td></tr>
28562741 <tr>
2857 -<td>
2858 -<p>3
2859 -</p>
2860 -</td>
2861 -<td>
2862 -<p>4
2863 -</p>
 2742+<td> 3 </td>
 2743+<td> 4
28642744 </td></tr></table>
28652745
28662746 !! end
@@ -4440,9 +4320,7 @@
44414321 !! result
44424322 <table>
44434323 <tr>
4444 -<th class="awesome">
4445 -<p>status
4446 -</p>
 4324+<th class="awesome"> status
44474325 </th></tr></table>
44484326
44494327 !!end
@@ -4896,9 +4774,7 @@
48974775 !! result
48984776 <table>
48994777 <tr>
4900 -<th style="color:blue">
4901 -<p>status
4902 -</p>
 4778+<th style="color:blue"> status
49034779 </th></tr></table>
49044780
49054781 !!end
@@ -4912,9 +4788,7 @@
49134789 !! result
49144790 <table>
49154791 <tr>
4916 -<th style="/* insecure input */">
4917 -<p>status
4918 -</p>
 4792+<th style="/* insecure input */"> status
49194793 </th></tr></table>
49204794
49214795 !! end
@@ -5583,15 +5457,10 @@
55845458 !! result
55855459 <table>
55865460 <tr>
 5461+<th>https://</th>
 5462+<th></th>
 5463+<th></th>
55875464 <th>
5588 -<p>https://
5589 -</p>
5590 -</th>
5591 -<th>
5592 -</th>
5593 -<th>
5594 -</th>
5595 -<th>
55965465 </td>
55975466 </tr>
55985467 </table>
@@ -5607,9 +5476,7 @@
56085477 !! result
56095478 <table>
56105479 <tr>
5611 -<th>
5612 -<p><a href="irc://{{ftp://a" class="external free" rel="nofollow">irc://{{ftp://a</a>" onmouseover="alert('hello world');"
5613 -</p>
 5480+<th> <a href="irc://{{ftp://a" class="external free" rel="nofollow">irc://{{ftp://a</a>" onmouseover="alert('hello world');"
56145481 </th>
56155482 <td>
56165483 </td>
@@ -5655,9 +5522,7 @@
56565523
56575524 MOVE YOUR MOUSE CURSOR OVER THIS TEXT
56585525 <tr>
5659 -<td>
5660 -<p></u>
5661 -</p>
 5526+<td></u>
56625527 </td>
56635528 </tr>
56645529 </table>
@@ -7772,17 +7637,18 @@
77737638 !!endarticle
77747639
77757640 !! test
7776 -Bug 529/12974: Uncovered bullet
 7641+Bug 529: Uncovered bullet
77777642 !! input
77787643 * Foo {{bullet}}
77797644 !! result
7780 -<ul><li> Foo * Bar
 7645+<ul><li> Foo
 7646+</li><li> Bar
77817647 </li></ul>
77827648
77837649 !! end
77847650
77857651 !! test
7786 -Bug 529/12974: Uncovered table already at line-start
 7652+Bug 529: Uncovered table already at line-start
77877653 !! input
77887654 x
77897655
@@ -7793,33 +7659,24 @@
77947660 </p>
77957661 <table>
77967662 <tr>
7797 -<td>
7798 -<p>1
7799 -</p>
7800 -</td>
7801 -<td>
7802 -<p>2
7803 -</p>
 7663+<td> 1 </td>
 7664+<td> 2
78047665 </td></tr>
78057666 <tr>
7806 -<td>
7807 -<p>3
7808 -</p>
7809 -</td>
7810 -<td>
7811 -<p>4
7812 -</p>
 7667+<td> 3 </td>
 7668+<td> 4
78137669 </td></tr></table>
78147670 <p>y
78157671 </p>
78167672 !! end
78177673
78187674 !! test
7819 -Bug 529/12974: Uncovered bullet in parser function result
 7675+Bug 529: Uncovered bullet in parser function result
78207676 !! input
78217677 * Foo {{lc:{{bullet}} }}
78227678 !! result
7823 -<ul><li> Foo * bar
 7679+<ul><li> Foo
 7680+</li><li> bar
78247681 </li></ul>
78257682
78267683 !! end
Index: trunk/phase3/includes/parser/Parser.php
@@ -814,22 +814,7 @@
815815 $has_opened_tr = array(); # Did this table open a <tr> element?
816816 $indent_level = 0; # indent level of the table
817817
818 - # Keep pulling lines off the front of the array until they're all gone.
819 - # we want to be able to push lines back on to the front of the stream,
820 - # but StringUtils::explode() returns funky optimised Iterators which don't
821 - # support insertion. So maintain a separate buffer and draw on that first if
822 - # there's anything in it
823 - $extraLines = array();
824 - $lines->rewind();
825 - do {
826 - if( $extraLines ){
827 - $outLine = array_shift( $extraLines );
828 - } elseif( $lines->valid() ) {
829 - $outLine = $lines->current();
830 - $lines->next();
831 - } else {
832 - break;
833 - }
 818+ foreach ( $lines as $outLine ) {
834819 $line = trim( $outLine );
835820
836821 if ( $line === '' ) { # empty line, go to next line
@@ -905,10 +890,11 @@
906891 } elseif ( $first_character === '|' || $first_character === '!' || substr( $line , 0 , 2 ) === '|+' ) {
907892 # This might be cell elements, td, th or captions
908893 if ( substr( $line , 0 , 2 ) === '|+' ) {
909 - $first_character = '|+';
 894+ $first_character = '+';
 895+ $line = substr( $line , 1 );
910896 }
911897
912 - $line = substr( $line , strlen( $first_character ) );
 898+ $line = substr( $line , 1 );
913899
914900 if ( $first_character === '!' ) {
915901 $line = str_replace( '!!' , '||' , $line );
@@ -919,84 +905,62 @@
920906 # by earlier parser steps, but should avoid splitting up eg
921907 # attribute values containing literal "||".
922908 $cells = StringUtils::explodeMarkup( '||' , $line );
923 - $cell = array_shift( $cells );
924909
925 - # Inject cells back into the stream to be dealt with later
926 - # TODO: really we should do the whole thing as a stream...
927 - # but that would be too much like a sensible implementation :P
928 - if( count( $cells ) ){
929 - foreach( array_reverse( $cells ) as $extraCell ){
930 - array_unshift( $extraLines, $first_character . $extraCell );
931 - }
932 - }
933 -
934910 $outLine = '';
935911
936 - $previous = '';
937 - if ( $first_character !== '|+' ) {
938 - $tr_after = array_pop( $tr_attributes );
939 - if ( !array_pop( $tr_history ) ) {
940 - $previous = "<tr{$tr_after}>\n";
 912+ # Loop through each table cell
 913+ foreach ( $cells as $cell ) {
 914+ $previous = '';
 915+ if ( $first_character !== '+' ) {
 916+ $tr_after = array_pop( $tr_attributes );
 917+ if ( !array_pop( $tr_history ) ) {
 918+ $previous = "<tr{$tr_after}>\n";
 919+ }
 920+ array_push( $tr_history , true );
 921+ array_push( $tr_attributes , '' );
 922+ array_pop( $has_opened_tr );
 923+ array_push( $has_opened_tr , true );
941924 }
942 - array_push( $tr_history , true );
943 - array_push( $tr_attributes , '' );
944 - array_pop( $has_opened_tr );
945 - array_push( $has_opened_tr , true );
946 - }
947925
948 - $last_tag = array_pop( $last_tag_history );
 926+ $last_tag = array_pop( $last_tag_history );
949927
950 - if ( array_pop( $td_history ) ) {
951 - $previous = "</{$last_tag}>\n{$previous}";
952 - }
 928+ if ( array_pop( $td_history ) ) {
 929+ $previous = "</{$last_tag}>\n{$previous}";
 930+ }
953931
954 - if ( $first_character === '|' ) {
955 - $last_tag = 'td';
956 - } elseif ( $first_character === '!' ) {
957 - $last_tag = 'th';
958 - } elseif ( $first_character === '|+' ) {
959 - $last_tag = 'caption';
960 - } else {
961 - $last_tag = '';
962 - }
 932+ if ( $first_character === '|' ) {
 933+ $last_tag = 'td';
 934+ } elseif ( $first_character === '!' ) {
 935+ $last_tag = 'th';
 936+ } elseif ( $first_character === '+' ) {
 937+ $last_tag = 'caption';
 938+ } else {
 939+ $last_tag = '';
 940+ }
963941
964 - array_push( $last_tag_history , $last_tag );
 942+ array_push( $last_tag_history , $last_tag );
965943
966 - # A cell could contain both parameters and data... but the pipe could
967 - # also be the start of a nested table, or a raw pipe inside an invalid
968 - # link (bug 553).
969 - $cell_data = preg_split( '/(?<!\{)\|/', $cell, 2 );
 944+ # A cell could contain both parameters and data
 945+ $cell_data = explode( '|' , $cell , 2 );
970946
971 - # Bug 553: a '|' inside an invalid link should not
972 - # be mistaken as delimiting cell parameters
973 - if ( strpos( $cell_data[0], '[[' ) !== false ) {
974 - $data = $cell;
975 - $cell = "{$previous}<{$last_tag}>";
976 - } elseif ( count( $cell_data ) == 1 ) {
977 - $cell = "{$previous}<{$last_tag}>";
978 - $data = $cell_data[0];
979 - } else {
980 - $attributes = $this->mStripState->unstripBoth( $cell_data[0] );
981 - $attributes = Sanitizer::fixTagAttributes( $attributes , $last_tag );
982 - $cell = "{$previous}<{$last_tag}{$attributes}>";
983 - $data = $cell_data[1];
984 - }
 947+ # Bug 553: Note that a '|' inside an invalid link should not
 948+ # be mistaken as delimiting cell parameters
 949+ if ( strpos( $cell_data[0], '[[' ) !== false ) {
 950+ $cell = "{$previous}<{$last_tag}>{$cell}";
 951+ } elseif ( count( $cell_data ) == 1 ) {
 952+ $cell = "{$previous}<{$last_tag}>{$cell_data[0]}";
 953+ } else {
 954+ $attributes = $this->mStripState->unstripBoth( $cell_data[0] );
 955+ $attributes = Sanitizer::fixTagAttributes( $attributes , $last_tag );
 956+ $cell = "{$previous}<{$last_tag}{$attributes}>{$cell_data[1]}";
 957+ }
985958
986 - # Bug 529: the start of a table cell should be a linestart context for
987 - # processing other block markup, including nested tables. The original
988 - # implementation of this was to add a newline before every brace construct,
989 - # which broke all manner of other things. Instead, push the contents
990 - # of the cell back into the stream and come back to it later. But don't
991 - # do that if the first line is empty, or you may get extra whitespace
992 - if( $data ){
993 - array_unshift( $extraLines, trim( $data ) );
 959+ $outLine .= $cell;
 960+ array_push( $td_history , true );
994961 }
995 -
996 - $outLine .= $cell;
997 - array_push( $td_history , true );
998962 }
999963 $out .= $outLine . "\n";
1000 - } while( $lines->valid() || count( $extraLines ) );
 964+ }
1001965
1002966 # Closing open td, tr && table
1003967 while ( count( $td_history ) > 0 ) {
@@ -2260,7 +2224,6 @@
22612225 '/(?:<\\/table|<\\/blockquote|<\\/h1|<\\/h2|<\\/h3|<\\/h4|<\\/h5|<\\/h6|'.
22622226 '<td|<th|<\\/?div|<hr|<\\/pre|<\\/p|'.$this->mUniqPrefix.'-pre|<\\/li|<\\/ul|<\\/ol|<\\/?center)/iS', $t );
22632227 if ( $openmatch or $closematch ) {
2264 -
22652228 $paragraphStack = false;
22662229 # TODO bug 5718: paragraph closed
22672230 $output .= $this->closeParagraph();
@@ -3251,11 +3214,11 @@
32523215 $text = wfEscapeWikiText( $text );
32533216 } elseif ( is_string( $text )
32543217 && !$piece['lineStart']
3255 - && preg_match( '/^{\\|/', $text ) )
 3218+ && preg_match( '/^(?:{\\||:|;|#|\*)/', $text ) )
32563219 {
3257 - # Bug 529: if the template begins with a table, it should be treated as
3258 - # beginning a new line. This previously handled other block-level elements
3259 - # such as #, :, etc, but these have many false-positives (bug 12974).
 3220+ # Bug 529: if the template begins with a table or block-level
 3221+ # element, it should be treated as beginning a new line.
 3222+ # This behaviour is somewhat controversial.
32603223 $text = "\n" . $text;
32613224 }
32623225
@@ -3314,7 +3277,7 @@
33153278
33163279 if ( !$title->equals( $cacheTitle ) ) {
33173280 $this->mTplRedirCache[$cacheTitle->getPrefixedDBkey()] =
3318 - array( $title->getNamespace(), $title->getDBkey() );
 3281+ array( $title->getNamespace(), $cdb = $title->getDBkey() );
33193282 }
33203283
33213284 return array( $dom, $title );
Index: trunk/phase3/RELEASE-NOTES
@@ -91,11 +91,6 @@
9292 members.
9393 * (bug 2585) Image pages should send 404 if no image, no shared image and no
9494 description page.
95 -* (bug 12974) The behaviour of wikitable and template parsing was altered. Previously,
96 - all templates were treated as being at the start of a block context for the purposes
97 - of evaluating wikitext elements like #, :, *, and tables (bug 529). Now this
98 - behaviour is only applied to wikitable-start {|, but the first line of a wikitable
99 - cell is now treated as a linestart.
10095 * Custom editintro's using the editintro url parameter will no longer show <noinclude>
10196 sections on pages they are included on.
10297 * (bug 26449) Keep underlines from headings outside of tables and thumbs by

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r80430(bug 529, bug 12974) alter behaviour of the table- and template-parser:...happy-melon23:57, 16 January 2011

Status & tagging log