r97145 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r97145
Date:12:10, 15 September 2011
Author:tstarling
Status:ok (Comments)
Tags:
Comment:
Reverted r85922 and related: new doTableStuff(). I copied in the old doTableStuff() from before r85922 and reverted all parser test changes that looked vaguely related. Apologies to Platonides, since some of his parser tests appeared to be relevant to the old parser, but it's simplest to just revert all the related changes and then re-add any useful tests later. See CR r85922 for full rationale.
Modified paths:
  • /trunk/phase3/CREDITS (modified)
  • /trunk/phase3/includes/Sanitizer.php (modified)
  • /trunk/phase3/includes/parser/Parser.php (modified)
  • /trunk/phase3/tests/parser/parserTests.txt (modified)

Diff

Index: trunk/phase3/CREDITS
@@ -83,7 +83,6 @@
8484 * Azliq7
8585 * Beau
8686 * Bergi
87 -* Bluehairedlawyer
8887 * Borislav Manolov
8988 * Brad Jorsch
9089 * Brent G
Index: trunk/phase3/tests/parser/parserTests.txt
@@ -1246,10 +1246,8 @@
12471247 |}
12481248 !! result
12491249 <table>
1250 -<caption>caption
1251 -</caption>
1252 -<tr><td></td></tr>
1253 -</table>
 1250+<caption> caption
 1251+</caption><tr><td></td></tr></table>
12541252
12551253 !! end
12561254
@@ -1264,342 +1262,17 @@
12651263 !! result
12661264 <table>
12671265 <tr>
1268 -<td>1
1269 -</td>
1270 -<td>2
1271 -</td>
1272 -</tr>
 1266+<td> 1 </td>
 1267+<td> 2
 1268+</td></tr>
12731269 <tr>
1274 -<td>3
1275 -</td>
1276 -<td>4
1277 -</td>
1278 -</tr>
1279 -</table>
 1270+<td> 3 </td>
 1271+<td> 4
 1272+</td></tr></table>
12801273
12811274 !! end
12821275
12831276 !! test
1284 -Table inside unclosed table w/o cells
1285 -!! input
1286 -{|
1287 -{|
1288 -| foo bar
1289 -|}
1290 -
1291 -!! result
1292 -<table>
1293 -<tr>
1294 -<td>
1295 -<table>
1296 -<tr>
1297 -<td>foo bar
1298 -</td>
1299 -</tr>
1300 -</table>
1301 -</td>
1302 -</tr>
1303 -</table>
1304 -
1305 -!! end
1306 -
1307 -!! test
1308 -Table with thead
1309 -!! input
1310 -{|
1311 -! Number !! Another number
1312 -|-
1313 -| 1 || 2
1314 -|-
1315 -| 3 || 4
1316 -|}
1317 -!! result
1318 -<table>
1319 -<thead>
1320 -<tr>
1321 -<th>Number
1322 -</th>
1323 -<th>Another number
1324 -</th>
1325 -</tr></thead>
1326 -<tbody>
1327 -<tr>
1328 -<td>1
1329 -</td>
1330 -<td>2
1331 -</td>
1332 -</tr>
1333 -<tr>
1334 -<td>3
1335 -</td>
1336 -<td>4
1337 -</td>
1338 -</tr></tbody>
1339 -</table>
1340 -
1341 -!! end
1342 -
1343 -!! test
1344 -Table with multiple captions: Only keep first
1345 -!! input
1346 -{|
1347 -|+ caption 1
1348 -|+ caption 2
1349 -|}
1350 -!! result
1351 -<table>
1352 -<caption>caption 1
1353 -</caption>
1354 -<tr><td></td></tr>
1355 -</table>
1356 -
1357 -!! end
1358 -
1359 -!! test
1360 -Table with multiline caption
1361 -!! input
1362 -{|
1363 -|+ caption 1
1364 -further caption
1365 -|}
1366 -!! result
1367 -<table>
1368 -<caption>caption 1
1369 -further caption
1370 -</caption>
1371 -<tr><td></td></tr>
1372 -</table>
1373 -
1374 -!! end
1375 -!! test
1376 -Table with multiple thead
1377 -!! input
1378 -{|
1379 -! Number !! Another number
1380 -|-
1381 -| 1 || 2
1382 -|-
1383 -! Some other number !! Another number
1384 -|-
1385 -| 3 || 4
1386 -|}
1387 -!! result
1388 -<table>
1389 -<thead>
1390 -<tr>
1391 -<th>Number
1392 -</th>
1393 -<th>Another number
1394 -</th>
1395 -</tr></thead>
1396 -<tbody>
1397 -<tr>
1398 -<td>1
1399 -</td>
1400 -<td>2
1401 -</td>
1402 -</tr></tbody>
1403 -<thead>
1404 -<tr>
1405 -<th>Some other number
1406 -</th>
1407 -<th>Another number
1408 -</th>
1409 -</tr></thead>
1410 -<tbody>
1411 -<tr>
1412 -<td>3
1413 -</td>
1414 -<td>4
1415 -</td>
1416 -</tr></tbody>
1417 -</table>
1418 -
1419 -!! end
1420 -!! test
1421 -Table with thead & tfoot
1422 -!! input
1423 -{|
1424 -! Number !! Another number
1425 -|-
1426 -| 1 || 2
1427 -|-
1428 -! Some other number !! Another number
1429 -|-
1430 -| 3 || 4
1431 -|-
1432 -! Total: 4 !! Total: 6
1433 -|}
1434 -!! result
1435 -<table>
1436 -<thead>
1437 -<tr>
1438 -<th>Number
1439 -</th>
1440 -<th>Another number
1441 -</th>
1442 -</tr></thead>
1443 -<tbody>
1444 -<tr>
1445 -<td>1
1446 -</td>
1447 -<td>2
1448 -</td>
1449 -</tr></tbody>
1450 -<thead>
1451 -<tr>
1452 -<th>Some other number
1453 -</th>
1454 -<th>Another number
1455 -</th>
1456 -</tr></thead>
1457 -<tbody>
1458 -<tr>
1459 -<td>3
1460 -</td>
1461 -<td>4
1462 -</td>
1463 -</tr></tbody>
1464 -<tfoot>
1465 -<tr>
1466 -<th>Total: 4
1467 -</th>
1468 -<th>Total: 6
1469 -</th>
1470 -</tr></tfoot>
1471 -</table>
1472 -
1473 -!! end
1474 -
1475 -!! test
1476 -Table have th inside tfoot
1477 -!! input
1478 -{|
1479 -| cell1 || cell2
1480 -|-
1481 -! Footer1 !! Footer2
1482 -|}
1483 -!! result
1484 -<table>
1485 -<tbody>
1486 -<tr>
1487 -<td>cell1
1488 -</td>
1489 -<td>cell2
1490 -</td>
1491 -</tr></tbody>
1492 -<tfoot>
1493 -<tr>
1494 -<th>Footer1
1495 -</th>
1496 -<th>Footer2
1497 -</th>
1498 -</tr></tfoot>
1499 -</table>
1500 -
1501 -!! end
1502 -
1503 -!! test
1504 -Table have th inside thead
1505 -!! input
1506 -{|
1507 -! Header1 !! Header2
1508 -|-
1509 -| cell1 || cell2
1510 -|}
1511 -!! result
1512 -<table>
1513 -<thead>
1514 -<tr>
1515 -<th>Header1
1516 -</th>
1517 -<th>Header2
1518 -</th>
1519 -</tr></thead>
1520 -<tbody>
1521 -<tr>
1522 -<td>cell1
1523 -</td>
1524 -<td>cell2
1525 -</td>
1526 -</tr></tbody>
1527 -</table>
1528 -
1529 -!! end
1530 -
1531 -!! test
1532 -Table with list inside
1533 -!! input
1534 -{|
1535 -|style="width: 5em; text-align: center"| gives
1536 -|style="border: 1px dashed #2F6FAB; padding: 0.5em; margin: 0.5em"|
1537 -# Some
1538 -# list
1539 -# Lorem
1540 -# ipsum
1541 -# dolor
1542 -|}
1543 -!! result
1544 -<table>
1545 -<tr>
1546 -<td style="width: 5em; text-align: center">gives
1547 -</td>
1548 -<td style="border: 1px dashed #2F6FAB; padding: 0.5em; margin: 0.5em">
1549 -<ol><li> Some
1550 -</li><li> list
1551 -</li><li> Lorem
1552 -</li><li> ipsum
1553 -</li><li> dolor
1554 -</li></ol>
1555 -</td>
1556 -</tr>
1557 -</table>
1558 -
1559 -!! end
1560 -!! test
1561 -Indented table wrapped in html tags (Related to Bug 26362)
1562 -!! input
1563 -<div>
1564 -:{|
1565 -|-
1566 -| test
1567 -|}</div>
1568 -
1569 -!! result
1570 -<div>
1571 -<dl><dd><table>
1572 -<tr>
1573 -<td>test
1574 -</td>
1575 -</tr>
1576 -</table></dd></dl></div>
1577 -
1578 -!! end
1579 -
1580 -!! test
1581 -Table with multiline contents
1582 -!! input
1583 -{|
1584 -| Alice
1585 -Bob
1586 -dfdfg
1587 -dfg
1588 -|}
1589 -!! result
1590 -<table>
1591 -<tr>
1592 -<td>Alice
1593 -<p>Bob
1594 -dfdfg
1595 -dfg
1596 -</p>
1597 -</td>
1598 -</tr>
1599 -</table>
1600 -
1601 -!! end
1602 -
1603 -!! test
16041277 Multiplication table
16051278 !! input
16061279 {| border="1" cellpadding="2"
@@ -1626,69 +1299,47 @@
16271300 <table border="1" cellpadding="2">
16281301 <caption>Multiplication table
16291302 </caption>
1630 -<thead>
16311303 <tr>
1632 -<th>&#215;
1633 -</th>
1634 -<th>1
1635 -</th>
1636 -<th>2
1637 -</th>
1638 -<th>3
1639 -</th>
1640 -</tr></thead>
1641 -<tbody>
 1304+<th> &#215; </th>
 1305+<th> 1 </th>
 1306+<th> 2 </th>
 1307+<th> 3
 1308+</th></tr>
16421309 <tr>
1643 -<th>1
 1310+<th> 1
16441311 </th>
1645 -<td>1
1646 -</td>
1647 -<td>2
1648 -</td>
1649 -<td>3
1650 -</td>
1651 -</tr>
 1312+<td> 1 </td>
 1313+<td> 2 </td>
 1314+<td> 3
 1315+</td></tr>
16521316 <tr>
1653 -<th>2
 1317+<th> 2
16541318 </th>
1655 -<td>2
1656 -</td>
1657 -<td>4
1658 -</td>
1659 -<td>6
1660 -</td>
1661 -</tr>
 1319+<td> 2 </td>
 1320+<td> 4 </td>
 1321+<td> 6
 1322+</td></tr>
16621323 <tr>
1663 -<th>3
 1324+<th> 3
16641325 </th>
1665 -<td>3
1666 -</td>
1667 -<td>6
1668 -</td>
1669 -<td>9
1670 -</td>
1671 -</tr>
 1326+<td> 3 </td>
 1327+<td> 6 </td>
 1328+<td> 9
 1329+</td></tr>
16721330 <tr>
1673 -<th>4
 1331+<th> 4
16741332 </th>
1675 -<td>4
1676 -</td>
1677 -<td>8
1678 -</td>
1679 -<td>12
1680 -</td>
1681 -</tr>
 1333+<td> 4 </td>
 1334+<td> 8 </td>
 1335+<td> 12
 1336+</td></tr>
16821337 <tr>
1683 -<th>5
 1338+<th> 5
16841339 </th>
1685 -<td>5
1686 -</td>
1687 -<td>10
1688 -</td>
1689 -<td>15
1690 -</td>
1691 -</tr></tbody>
1692 -</table>
 1340+<td> 5 </td>
 1341+<td> 10 </td>
 1342+<td> 15
 1343+</td></tr></table>
16931344
16941345 !! end
16951346
@@ -1706,20 +1357,17 @@
17071358 !! result
17081359 <table align="right" border="1">
17091360 <tr>
1710 -<td>Cell 1, row 1
 1361+<td> Cell 1, row 1
17111362 </td>
1712 -<td rowspan="2">Cell 2, row 1 (and 2)
 1363+<td rowspan="2"> Cell 2, row 1 (and 2)
17131364 </td>
1714 -<td>Cell 3, row 1
1715 -</td>
1716 -</tr>
 1365+<td> Cell 3, row 1
 1366+</td></tr>
17171367 <tr>
1718 -<td>Cell 1, row 2
 1368+<td> Cell 1, row 2
17191369 </td>
1720 -<td>Cell 3, row 2
1721 -</td>
1722 -</tr>
1723 -</table>
 1370+<td> Cell 3, row 2
 1371+</td></tr></table>
17241372
17251373 !! end
17261374
@@ -1739,24 +1387,19 @@
17401388 !! result
17411389 <table border="1">
17421390 <tr>
1743 -<td>&#945;
 1391+<td> &#945;
17441392 </td>
17451393 <td>
17461394 <table bgcolor="#ABCDEF" border="2">
17471395 <tr>
17481396 <td>nested
1749 -</td>
1750 -</tr>
 1397+</td></tr>
17511398 <tr>
17521399 <td>table
 1400+</td></tr></table>
17531401 </td>
1754 -</tr>
1755 -</table>
1756 -</td>
17571402 <td>the original table again
1758 -</td>
1759 -</tr>
1760 -</table>
 1403+</td></tr></table>
17611404
17621405 !! end
17631406
@@ -1770,87 +1413,12 @@
17711414 <table>
17721415 <tr>
17731416 <td>broken
1774 -</td>
1775 -</tr>
1776 -</table>
 1417+</td></tr></table>
17771418
17781419 !! end
17791420
1780 -!! test
1781 -Heading inside table (affected by r85922)
1782 -!! input
1783 -{|
1784 -|- valign="top"
1785 -|
1786 -=== Heading ===
1787 -|}
1788 -!! result
1789 -<table>
1790 -<tr valign="top">
1791 -<td>
1792 -<h3><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Parser_test&amp;action=edit&amp;section=1" title="Edit section: Heading">edit</a>]</span> <span class="mw-headline" id="Heading"> Heading </span></h3>
1793 -</td>
1794 -</tr>
1795 -</table>
17961421
1797 -!! end
1798 -
17991422 !! test
1800 -A table with a caption with unclosed italic
1801 -!! input
1802 -{|
1803 -|+ ''caption
1804 -| Cell
1805 -|}
1806 -!! result
1807 -<table>
1808 -<caption><i>caption</i>
1809 -</caption>
1810 -<tr>
1811 -<td>Cell
1812 -</td>
1813 -</tr>
1814 -</table>
1815 -
1816 -!! end
1817 -
1818 -!! test
1819 -A table with unclosed italic in a cell
1820 -!! input
1821 -{|
1822 -| ''Cell
1823 -|}
1824 -!! result
1825 -<table>
1826 -<tr>
1827 -<td><i>Cell</i>
1828 -</td>
1829 -</tr>
1830 -</table>
1831 -
1832 -!! end
1833 -
1834 -!! test
1835 -A table with unclosed italic in a th
1836 -!! input
1837 -{|
1838 -|-
1839 -! ''Cell
1840 -|| Value
1841 -|}
1842 -!! result
1843 -<table>
1844 -<tr>
1845 -<th><i>Cell</i>
1846 -</th>
1847 -<td>Value
1848 -</td>
1849 -</tr>
1850 -</table>
1851 -
1852 -!! end
1853 -
1854 -!! test
18551423 Table security: embedded pipes (http://lists.wikimedia.org/mailman/htdig/wikitech-l/2006-April/022293.html)
18561424 !! input
18571425 {|
@@ -1858,8 +1426,7 @@
18591427 !! result
18601428 <table>
18611429 <tr>
1862 -<td>[<a rel="nofollow" class="external free" href="ftp://%7Cx">ftp://%7Cx</a>
1863 -</td>
 1430+<td>[<a rel="nofollow" class="external free" href="ftp://%7Cx">ftp://%7Cx</a></td>
18641431 <td>]" onmouseover="alert(document.cookie)"&gt;test
18651432 </td>
18661433 </tr>
@@ -1867,71 +1434,7 @@
18681435
18691436 !! end
18701437
1871 -!! test
1872 -Indented Tables, bug 20078
1873 -!! input
1874 -: {|
1875 -| 1 || 2
1876 -|-
1877 -| 3 || 4
1878 -|}
1879 -!! result
1880 -<dl><dd><table>
1881 -<tr>
1882 -<td>1
1883 -</td>
1884 -<td>2
1885 -</td>
1886 -</tr>
1887 -<tr>
1888 -<td>3
1889 -</td>
1890 -<td>4
1891 -</td>
1892 -</tr>
1893 -</table></dd></dl>
18941438
1895 -!! end
1896 -
1897 -!! test
1898 -Arbitrary whitespace should not be prepended
1899 -!! input
1900 -{|
1901 -| 1 || 2
1902 -
1903 -|-
1904 -
1905 -
1906 -| 3 || 4
1907 -|-
1908 -
1909 -| 6 || 8
1910 -|}
1911 -!! result
1912 -<table>
1913 -<tr>
1914 -<td>1
1915 -</td>
1916 -<td>2
1917 -</td>
1918 -</tr>
1919 -<tr>
1920 -<td>3
1921 -</td>
1922 -<td>4
1923 -</td>
1924 -</tr>
1925 -<tr>
1926 -<td>6
1927 -</td>
1928 -<td>8
1929 -</td>
1930 -</tr>
1931 -</table>
1932 -
1933 -!! end
1934 -
1935 -
19361439 ###
19371440 ### Internal links
19381441 ###
@@ -3220,9 +2723,7 @@
32212724 <table>
32222725 <tr>
32232726 <td>[[{{{1}}}|{{{2}}}]]
3224 -</td>
3225 -</tr>
3226 -</table>
 2727+</td></tr></table>
32272728
32282729 !! end
32292730
@@ -3331,18 +2832,13 @@
33322833 </p>
33332834 <table>
33342835 <tr>
3335 -<td>1
3336 -</td>
3337 -<td>2
3338 -</td>
3339 -</tr>
 2836+<td> 1 </td>
 2837+<td> 2
 2838+</td></tr>
33402839 <tr>
3341 -<td>3
3342 -</td>
3343 -<td>4
3344 -</td>
3345 -</tr>
3346 -</table>
 2840+<td> 3 </td>
 2841+<td> 4
 2842+</td></tr></table>
33472843
33482844 !! end
33492845
@@ -3356,18 +2852,13 @@
33572853 </p>
33582854 <table>
33592855 <tr>
3360 -<td>1
3361 -</td>
3362 -<td>2
3363 -</td>
3364 -</tr>
 2856+<td> 1 </td>
 2857+<td> 2
 2858+</td></tr>
33652859 <tr>
3366 -<td>3
3367 -</td>
3368 -<td>4
3369 -</td>
3370 -</tr>
3371 -</table>
 2860+<td> 3 </td>
 2861+<td> 4
 2862+</td></tr></table>
33722863
33732864 !! end
33742865
@@ -5050,10 +4541,8 @@
50514542 !! result
50524543 <table>
50534544 <tr>
5054 -<th class="awesome">status
5055 -</th>
5056 -</tr>
5057 -</table>
 4545+<th class="awesome"> status
 4546+</th></tr></table>
50584547
50594548 !!end
50604549
@@ -5497,10 +4986,8 @@
54984987 !! result
54994988 <table>
55004989 <tr>
5501 -<th style="color:blue">status
5502 -</th>
5503 -</tr>
5504 -</table>
 4990+<th style="color:blue"> status
 4991+</th></tr></table>
55054992
55064993 !!end
55074994
@@ -5513,10 +5000,8 @@
55145001 !! result
55155002 <table>
55165003 <tr>
5517 -<th style="/* insecure input */">status
5518 -</th>
5519 -</tr>
5520 -</table>
 5004+<th style="/* insecure input */"> status
 5005+</th></tr></table>
55215006
55225007 !! end
55235008
@@ -6138,7 +5623,8 @@
61395624 !! result
61405625 <table>
61415626 <tr>
6142 -<td></td>
 5627+<td>
 5628+</td>
61435629 </tr>
61445630 </table>
61455631
@@ -6164,14 +5650,10 @@
61655651 !! input
61665652 ==a==
61675653 {| STYLE=__TOC__
6168 -|foo
61695654 !! result
61705655 <h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Parser_test&amp;action=edit&amp;section=1" title="Edit section: a">edit</a>]</span> <span class="mw-headline" id="a">a</span></h2>
61715656 <table style="&#95;_TOC&#95;_">
6172 -<tr>
6173 -<td>foo
6174 -</td>
6175 -</tr>
 5657+<tr><td></td></tr>
61765658 </table>
61775659
61785660 !! end
@@ -6187,11 +5669,11 @@
61885670 !! result
61895671 <table>
61905672 <tr>
6191 -<th>https://
6192 -</th>
 5673+<th>https://</th>
61935674 <th></th>
61945675 <th></th>
6195 -<th></th>
 5676+<th>
 5677+</td>
61965678 </tr>
61975679 </table>
61985680
@@ -6206,9 +5688,10 @@
62075689 !! result
62085690 <table>
62095691 <tr>
6210 -<th><a rel="nofollow" class="external free" href="irc://{{ftp://a">irc://{{ftp://a</a>" onmouseover="alert('hello world');"
 5692+<th> <a rel="nofollow" class="external free" href="irc://{{ftp://a">irc://{{ftp://a</a>" onmouseover="alert('hello world');"
62115693 </th>
6212 -<td></td>
 5694+<td>
 5695+</td>
62135696 </tr>
62145697 </table>
62155698
@@ -6220,30 +5703,16 @@
62215704 http://===r:::https://b
62225705
62235706 {|
6224 -
62255707 !!result
62265708 <p><a rel="nofollow" class="external free" href="http://===r:::https://b">http://===r:::https://b</a>
62275709 </p>
 5710+<table>
 5711+<tr><td></td></tr>
 5712+</table>
 5713+
62285714 !! end
62295715
62305716 # Known to produce bad XML for now
6231 -
6232 -# Note: the current result listed for this is not what the original one was,
6233 -# but the original bug was JavaScript injection, which is fixed in any case.
6234 -# It's not clear that the original result listed was any more correct than the
6235 -# current one. Original result:
6236 -# <table>
6237 -# {{{|
6238 -# <u class="&#124;">}}}} &gt;
6239 -# <br style="onmouseover=&#39;alert(document.cookie);&#39;" />
6240 -#
6241 -# MOVE YOUR MOUSE CURSOR OVER THIS TEXT
6242 -# <tr>
6243 -# <td></u>
6244 -# </td>
6245 -# </tr>
6246 -# </table>
6247 -# Known to produce bad XML for now
62485717 !! test
62495718 Fuzz testing: Parser24
62505719 !! options
@@ -6258,12 +5727,12 @@
62595728 MOVE YOUR MOUSE CURSOR OVER THIS TEXT
62605729 |
62615730 !! result
6262 -<p>{{{|
 5731+<table>
 5732+{{{|
62635733 <u class="&#124;">}}}} &gt;
62645734 <br style="onmouseover=&#39;alert(document.cookie);&#39;" />
 5735+
62655736 MOVE YOUR MOUSE CURSOR OVER THIS TEXT
6266 -</p>
6267 -<table>
62685737 <tr>
62695738 <td></u>
62705739 </td>
@@ -8472,18 +7941,13 @@
84737942 </p>
84747943 <table>
84757944 <tr>
8476 -<td>1
8477 -</td>
8478 -<td>2
8479 -</td>
8480 -</tr>
 7945+<td> 1 </td>
 7946+<td> 2
 7947+</td></tr>
84817948 <tr>
8482 -<td>3
8483 -</td>
8484 -<td>4
8485 -</td>
8486 -</tr>
8487 -</table>
 7949+<td> 3 </td>
 7950+<td> 4
 7951+</td></tr></table>
84887952 <p>y
84897953 </p>
84907954 !! end
Index: trunk/phase3/includes/parser/Parser.php
@@ -841,318 +841,195 @@
842842 * parse the wiki syntax used to render tables
843843 *
844844 * @private
845 - *
846 - * @param $text string
847 - *
848 - * @return string
849845 */
850846 function doTableStuff( $text ) {
851847 wfProfileIn( __METHOD__ );
852848
853849 $lines = StringUtils::explode( "\n", $text );
854850 $out = '';
855 - $output =& $out;
 851+ $td_history = array(); # Is currently a td tag open?
 852+ $last_tag_history = array(); # Save history of last lag activated (td, th or caption)
 853+ $tr_history = array(); # Is currently a tr tag open?
 854+ $tr_attributes = array(); # history of tr attributes
 855+ $has_opened_tr = array(); # Did this table open a <tr> element?
 856+ $indent_level = 0; # indent level of the table
856857
857858 foreach ( $lines as $outLine ) {
858859 $line = trim( $outLine );
859860
860 - # empty line, go to next line,
861 - # but only append \n if outside of table
862 - if ( $line === '' ) {
863 - $output .= $outLine;
864 - if ( !isset( $tables[0] ) ) {
865 - $output .= "\n";
866 - }
 861+ if ( $line === '' ) { # empty line, go to next line
 862+ $out .= $outLine."\n";
867863 continue;
868864 }
869 - $firstChars = $line[0];
870 - if ( strlen( $line ) > 1 ) {
871 - $firstChars .= in_array( $line[1], array( '}', '+', '-' ) ) ? $line[1] : '';
872 - }
 865+
 866+ $first_character = $line[0];
873867 $matches = array();
874868
875 - if ( preg_match( '/^(:*)\s*\{\|(.*)$/', $line , $matches ) ) {
876 - $tables[] = array();
877 - $table =& $this->last( $tables );
878 - $table[0] = array(); // first row
879 - $currentRow =& $table[0];
880 - $table['indent'] = strlen( $matches[1] );
 869+ if ( preg_match( '/^(:*)\{\|(.*)$/', $line , $matches ) ) {
 870+ # First check if we are starting a new table
 871+ $indent_level = strlen( $matches[1] );
881872
882873 $attributes = $this->mStripState->unstripBoth( $matches[2] );
883874 $attributes = Sanitizer::fixTagAttributes( $attributes , 'table' );
884875
885 - if ( $attributes !== '' ) {
886 - $table['attributes'] = $attributes;
887 - }
888 - } elseif ( !isset( $tables[0] ) ) {
889 - // we're outside the table
 876+ $outLine = str_repeat( '<dl><dd>' , $indent_level ) . "<table{$attributes}>";
 877+ array_push( $td_history , false );
 878+ array_push( $last_tag_history , '' );
 879+ array_push( $tr_history , false );
 880+ array_push( $tr_attributes , '' );
 881+ array_push( $has_opened_tr , false );
 882+ } elseif ( count( $td_history ) == 0 ) {
 883+ # Don't do any of the following
 884+ $out .= $outLine."\n";
 885+ continue;
 886+ } elseif ( substr( $line , 0 , 2 ) === '|}' ) {
 887+ # We are ending a table
 888+ $line = '</table>' . substr( $line , 2 );
 889+ $last_tag = array_pop( $last_tag_history );
890890
891 - $out .= $outLine . "\n";
892 - } elseif ( $firstChars === '|}' ) {
893 - // trim the |} code from the line
894 - $line = substr ( $line , 2 );
895 -
896 - // Shorthand for last row
897 - $lastRow =& $this->last( $table );
898 -
899 - // a thead at the end becomes a tfoot, unless there is only one row
900 - // Do this before deleting empty last lines to allow headers at the bottom of tables
901 - if ( isset( $lastRow['type'] ) && $lastRow['type'] == 'thead' && isset( $table[1] ) ) {
902 - $lastRow['type'] = 'tfoot';
903 - for ( $i = 0; isset( $lastRow[$i] ); $i++ ) {
904 - $lastRow[$i]['type'] = 'th';
905 - }
 891+ if ( !array_pop( $has_opened_tr ) ) {
 892+ $line = "<tr><td></td></tr>{$line}";
906893 }
907894
908 - // Delete empty last lines
909 - if ( empty( $lastRow ) ) {
910 - $lastRow = NULL;
 895+ if ( array_pop( $tr_history ) ) {
 896+ $line = "</tr>{$line}";
911897 }
912 - $o = '';
913 - $curtable = array_pop( $tables );
914898
915 - #Add a line-ending before the table, but only if there isn't one already
916 - if ( substr( $out, -1 ) !== "\n" ) {
917 - $o .= "\n";
 899+ if ( array_pop( $td_history ) ) {
 900+ $line = "</{$last_tag}>{$line}";
918901 }
919 - $o .= $this->generateTableHTML( $curtable ) . $line . "\n";
 902+ array_pop( $tr_attributes );
 903+ $outLine = $line . str_repeat( '</dd></dl>' , $indent_level );
 904+ } elseif ( substr( $line , 0 , 2 ) === '|-' ) {
 905+ # Now we have a table row
 906+ $line = preg_replace( '#^\|-+#', '', $line );
920907
921 - if ( count( $tables ) > 0 ) {
922 - $table =& $this->last( $tables );
923 - $currentRow =& $this->last( $table );
924 - $currentElement =& $this->last( $currentRow );
925 -
926 - $output =& $currentElement['content'];
927 - } else {
928 - $output =& $out;
929 - }
930 -
931 - $output .= $o;
932 -
933 - } elseif ( $firstChars === '|-' ) {
934 - // start a new row element
935 - // but only when we haven't started one already
936 - if ( count( $currentRow ) != 0 ) {
937 - $table[] = array();
938 - $currentRow =& $this->last( $table );
939 - }
940 - // Get the attributes, there's nothing else useful in $line now
941 - $line = substr ( $line , 2 );
 908+ # Whats after the tag is now only attributes
942909 $attributes = $this->mStripState->unstripBoth( $line );
943910 $attributes = Sanitizer::fixTagAttributes( $attributes, 'tr' );
944 - if ( $attributes !== '' ) {
945 - $currentRow['attributes'] = $attributes;
946 - }
 911+ array_pop( $tr_attributes );
 912+ array_push( $tr_attributes, $attributes );
947913
948 - } elseif ( $firstChars === '|+' ) {
949 - // a table caption, but only proceed if there isn't one already
950 - if ( !isset ( $table['caption'] ) ) {
951 - $line = substr ( $line , 2 );
 914+ $line = '';
 915+ $last_tag = array_pop( $last_tag_history );
 916+ array_pop( $has_opened_tr );
 917+ array_push( $has_opened_tr , true );
952918
953 - $c = $this->getCellAttr( $line , 'caption' );
954 - $table['caption'] = array();
955 - $table['caption']['content'] = $c[0];
956 - if ( isset( $c[1] ) ) $table['caption']['attributes'] = $c[1];
957 - unset( $c );
958 - $output =& $table['caption']['content'];
 919+ if ( array_pop( $tr_history ) ) {
 920+ $line = '</tr>';
959921 }
960 - } elseif ( $firstChars === '|' || $firstChars === '!' || $firstChars === '!+' ) {
961 - // Which kind of cells are we dealing with
962 - $currentTag = 'td';
963 - $line = substr ( $line , 1 );
964922
965 - if ( $firstChars === '!' || $firstChars === '!+' ) {
966 - $line = str_replace ( '!!' , '||' , $line );
967 - $currentTag = 'th';
 923+ if ( array_pop( $td_history ) ) {
 924+ $line = "</{$last_tag}>{$line}";
968925 }
969926
970 - // Split up multiple cells on the same line.
971 - $cells = StringUtils::explodeMarkup( '||' , $line );
972 - $line = ''; // save memory
973 -
974 - // decide whether thead to tbody
975 - if ( !array_key_exists( 'type', $currentRow ) ) {
976 - $currentRow['type'] = ( $firstChars === '!' ) ? 'thead' : 'tbody' ;
977 - } elseif ( $firstChars === '|' ) {
978 - $currentRow['type'] = 'tbody';
 927+ $outLine = $line;
 928+ array_push( $tr_history , false );
 929+ array_push( $td_history , false );
 930+ array_push( $last_tag_history , '' );
 931+ } elseif ( $first_character === '|' || $first_character === '!' || substr( $line , 0 , 2 ) === '|+' ) {
 932+ # This might be cell elements, td, th or captions
 933+ if ( substr( $line , 0 , 2 ) === '|+' ) {
 934+ $first_character = '+';
 935+ $line = substr( $line , 1 );
979936 }
980937
981 - // Loop through each table cell
982 - foreach ( $cells as $cell ) {
983 - // a new cell
984 - $currentRow[] = array();
985 - $currentElement =& $this->last( $currentRow );
 938+ $line = substr( $line , 1 );
986939
987 - $currentElement['type'] = $currentTag;
988 -
989 - $c = $this->getCellAttr( $cell , $currentTag );
990 - $currentElement['content'] = $c[0];
991 - if ( isset( $c[1] ) ) $currentElement['attributes'] = $c[1];
992 - unset( $c );
 940+ if ( $first_character === '!' ) {
 941+ $line = str_replace( '!!' , '||' , $line );
993942 }
994 - $output =& $currentElement['content'];
995943
996 - } else {
997 - $output .= "\n$outLine";
998 - }
999 - }
 944+ # Split up multiple cells on the same line.
 945+ # FIXME : This can result in improper nesting of tags processed
 946+ # by earlier parser steps, but should avoid splitting up eg
 947+ # attribute values containing literal "||".
 948+ $cells = StringUtils::explodeMarkup( '||' , $line );
1000949
1001 - # Remove trailing line-ending (b/c)
1002 - if ( substr( $out, -1 ) === "\n" ) {
1003 - $out = substr( $out, 0, -1 );
1004 - }
 950+ $outLine = '';
1005951
1006 - # Close any unclosed tables
1007 - if ( isset( $tables ) && count( $tables ) > 0 ) {
1008 - for ( $i = 0; $i < count( $tables ); $i++ ) {
1009 - $curtable = array_pop( $tables );
1010 - $curtable = $this->generateTableHTML( $curtable );
1011 - #Add a line-ending before the table, but only if there isn't one already
1012 - if ( substr( $out, -1 ) !== "\n" && $curtable !== "" ) {
1013 - $out .= "\n";
1014 - }
1015 - $out .= $curtable;
1016 - }
1017 - }
 952+ # Loop through each table cell
 953+ foreach ( $cells as $cell ) {
 954+ $previous = '';
 955+ if ( $first_character !== '+' ) {
 956+ $tr_after = array_pop( $tr_attributes );
 957+ if ( !array_pop( $tr_history ) ) {
 958+ $previous = "<tr{$tr_after}>\n";
 959+ }
 960+ array_push( $tr_history , true );
 961+ array_push( $tr_attributes , '' );
 962+ array_pop( $has_opened_tr );
 963+ array_push( $has_opened_tr , true );
 964+ }
1018965
1019 - wfProfileOut( __METHOD__ );
 966+ $last_tag = array_pop( $last_tag_history );
1020967
1021 - return $out;
1022 - }
 968+ if ( array_pop( $td_history ) ) {
 969+ $previous = "</{$last_tag}>\n{$previous}";
 970+ }
1023971
1024 - /**
1025 - * Helper function for doTableStuff() separating the contents of cells from
1026 - * attributes. Particularly useful as there's a possible bug and this action
1027 - * is repeated twice.
1028 - *
1029 - * @private
1030 - * @param $cell
1031 - * @param $tagName
1032 - * @return array
1033 - */
1034 - function getCellAttr ( $cell, $tagName ) {
1035 - $attributes = null;
 972+ if ( $first_character === '|' ) {
 973+ $last_tag = 'td';
 974+ } elseif ( $first_character === '!' ) {
 975+ $last_tag = 'th';
 976+ } elseif ( $first_character === '+' ) {
 977+ $last_tag = 'caption';
 978+ } else {
 979+ $last_tag = '';
 980+ }
1036981
1037 - $cell = trim ( $cell );
 982+ array_push( $last_tag_history , $last_tag );
1038983
1039 - // A cell could contain both parameters and data
1040 - $cellData = explode ( '|' , $cell , 2 );
 984+ # A cell could contain both parameters and data
 985+ $cell_data = explode( '|' , $cell , 2 );
1041986
1042 - // Bug 553: Note that a '|' inside an invalid link should not
1043 - // be mistaken as delimiting cell parameters
1044 - if ( strpos( $cellData[0], '[[' ) !== false ) {
1045 - $content = trim ( $cell );
1046 - }
1047 - elseif ( count ( $cellData ) == 1 ) {
1048 - $content = trim ( $cellData[0] );
1049 - } else {
1050 - $attributes = $this->mStripState->unstripBoth( $cellData[0] );
1051 - $attributes = Sanitizer::fixTagAttributes( $attributes , $tagName );
 987+ # Bug 553: Note that a '|' inside an invalid link should not
 988+ # be mistaken as delimiting cell parameters
 989+ if ( strpos( $cell_data[0], '[[' ) !== false ) {
 990+ $cell = "{$previous}<{$last_tag}>{$cell}";
 991+ } elseif ( count( $cell_data ) == 1 ) {
 992+ $cell = "{$previous}<{$last_tag}>{$cell_data[0]}";
 993+ } else {
 994+ $attributes = $this->mStripState->unstripBoth( $cell_data[0] );
 995+ $attributes = Sanitizer::fixTagAttributes( $attributes , $last_tag );
 996+ $cell = "{$previous}<{$last_tag}{$attributes}>{$cell_data[1]}";
 997+ }
1052998
1053 - $content = trim ( $cellData[1] );
 999+ $outLine .= $cell;
 1000+ array_push( $td_history , true );
 1001+ }
 1002+ }
 1003+ $out .= $outLine . "\n";
10541004 }
1055 - return array( $content, $attributes );
1056 - }
10571005
1058 -
1059 - /**
1060 - * Helper function for doTableStuff(). This converts the structured array into html.
1061 - *
1062 - * @private
1063 - *
1064 - * @param $table array
1065 - *
1066 - * @return string
1067 - */
1068 - function generateTableHTML( &$table ) {
1069 - $return = str_repeat( '<dl><dd>' , $table['indent'] );
1070 - $return .= '<table';
1071 - $return .= isset( $table['attributes'] ) ? $table['attributes'] : '';
1072 - $return .= '>';
1073 - unset( $table['attributes'] );
1074 -
1075 - if ( isset( $table['caption'] ) ) {
1076 - $return .= "\n<caption";
1077 - $return .= isset( $table['caption']['attributes'] ) ? $table['caption']['attributes'] : '';
1078 - $return .= '>';
1079 - $return .= $table['caption']['content'];
1080 - $return .= "\n</caption>";
1081 - }
1082 - $lastSection = '';
1083 - $empty = true;
1084 - $simple = true;
1085 -
1086 - // If we only have tbodies, mark table as simple
1087 - for ( $i = 0; isset( $table[$i] ); $i++ ) {
1088 - if ( !count( $table[$i] ) ) continue;
1089 - if ( !isset( $table[$i]['type'] ) ) {
1090 - $table[$i]['type'] = 'tbody';
 1006+ # Closing open td, tr && table
 1007+ while ( count( $td_history ) > 0 ) {
 1008+ if ( array_pop( $td_history ) ) {
 1009+ $out .= "</td>\n";
10911010 }
1092 - if ( !$lastSection ) {
1093 - $lastSection = $table[$i]['type'];
1094 - } elseif ( $lastSection != $table[$i]['type'] ) {
1095 - $simple = false;
 1011+ if ( array_pop( $tr_history ) ) {
 1012+ $out .= "</tr>\n";
10961013 }
1097 - }
1098 - $lastSection = '';
1099 - for ( $i = 0; isset( $table[$i] ); $i++ ) {
1100 - if ( !count( $table[$i] ) ) continue;
1101 - $empty = false; // check for empty tables
1102 -
1103 - if ( $table[$i]['type'] != $lastSection && !$simple ) {
1104 - $return .= "\n<" . $table[$i]['type'] . '>';
 1014+ if ( !array_pop( $has_opened_tr ) ) {
 1015+ $out .= "<tr><td></td></tr>\n" ;
11051016 }
11061017
1107 - $return .= "\n<tr";
1108 - $return .= isset( $table[$i]['attributes'] ) ? $table[$i]['attributes'] : '';
1109 - $return .= '>';
1110 - for ( $j = 0; isset( $table[$i][$j] ); $j++ ) {
1111 - if ( !isset( $table[$i][$j]['type'] ) ) $table[$i][$j]['type'] = 'td';
1112 - $return .= "\n<" . $table[$i][$j]['type'];
1113 - $return .= isset( $table[$i][$j]['attributes'] ) ? $table[$i][$j]['attributes'] : '';
1114 - $return .= '>';
 1018+ $out .= "</table>\n";
 1019+ }
11151020
1116 - $return .= $table[$i][$j]['content'];
1117 - if ( $table[$i][$j]['content'] != '' )
1118 - $return .= "\n";
 1021+ # Remove trailing line-ending (b/c)
 1022+ if ( substr( $out, -1 ) === "\n" ) {
 1023+ $out = substr( $out, 0, -1 );
 1024+ }
11191025
1120 - $return .= '</' . $table[$i][$j]['type'] . '>';
1121 - unset( $table[$i][$j] );
1122 - }
1123 - $return .= "\n</tr>";
1124 -
1125 - if ( ( !isset( $table[$i + 1] ) && !$simple ) || ( isset( $table[$i + 1] ) && isset( $table[$i + 1]['type'] ) && $table[$i]['type'] != $table[$i + 1]['type'] ) ) {
1126 - $return .= '</' . $table[$i]['type'] . '>';
1127 - }
1128 - $lastSection = $table[$i]['type'];
1129 - unset( $table[$i] );
 1026+ # special case: don't return empty table
 1027+ if ( $out === "<table>\n<tr><td></td></tr>\n</table>" ) {
 1028+ $out = '';
11301029 }
1131 - if ( $empty ) {
1132 - if ( isset( $table['caption'] ) ) {
1133 - $return .= "\n<tr><td></td></tr>";
1134 - } else {
1135 - return '';
1136 - }
1137 - }
1138 - $return .= "\n</table>";
1139 - $return .= str_repeat( '</dd></dl>' , $table['indent'] );
11401030
1141 - return $return;
1142 - }
 1031+ wfProfileOut( __METHOD__ );
11431032
1144 - /**
1145 - * like end() but only works on the numeric array index and php's internal pointers
1146 - * returns a reference to the last element of an array much like "\$arr[-1]" in perl
1147 - * ignores associative elements and will create a 0 key will a NULL value if there were
1148 - * no numric elements and an array itself if not previously defined.
1149 - *
1150 - * @private
1151 - *
1152 - * @param $arr array
1153 - */
1154 - function &last ( &$arr ) {
1155 - for ( $i = count( $arr ); ( !isset( $arr[$i] ) && $i > 0 ); $i-- ) { }
1156 - return $arr[$i];
 1033+ return $out;
11571034 }
11581035
11591036 /**
Index: trunk/phase3/includes/Sanitizer.php
@@ -379,7 +379,7 @@
380380 'strike', 'strong', 'tt', 'var', 'div', 'center',
381381 'blockquote', 'ol', 'ul', 'dl', 'table', 'caption', 'pre',
382382 'ruby', 'rt' , 'rb' , 'rp', 'p', 'span', 'abbr', 'dfn',
383 - 'kbd', 'samp', 'thead', 'tbody', 'tfoot'
 383+ 'kbd', 'samp'
384384 );
385385 $htmlsingle = array(
386386 'br', 'hr', 'li', 'dt', 'dd'
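
Note on the restored approach: the Parser.php hunk above goes back to emitting HTML line by line while tracking open tags on per-table stacks, instead of building a table structure first and serializing it afterwards. The following is a minimal sketch of that idea, not the committed code. The function name sketchTableStuff() and the variables $openCell, $openRow and $hasRow are hypothetical, and the sketch assumes only "{|", "|-", "|}" and single-cell "|" / "!" lines. It ignores attributes, captions, "||" splitting, indentation, sanitizing and the closing of unclosed tables.

```php
<?php
// Hypothetical, simplified illustration of stack-based table parsing.
function sketchTableStuff( $text ) {
	$out = '';
	$openCell = array(); // per table: tag name of the open cell, '' if none
	$openRow = array();  // per table: is a <tr> currently open?
	$hasRow = array();   // per table: has any <tr> been started or declared?

	foreach ( explode( "\n", $text ) as $rawLine ) {
		$line = trim( $rawLine );
		$start = substr( $line, 0, 2 );

		if ( $start === '{|' ) {
			// Start a table and push a fresh state frame for it
			$outLine = '<table>';
			array_push( $openCell, '' );
			array_push( $openRow, false );
			array_push( $hasRow, false );
		} elseif ( $line === '' || count( $openCell ) === 0 ) {
			// Blank line, or text outside any table: pass through
			$outLine = $rawLine;
		} elseif ( $start === '|}' ) {
			// End of table: close whatever is still open, innermost first
			$outLine = '</table>';
			if ( !array_pop( $hasRow ) ) {
				$outLine = '<tr><td></td></tr>' . $outLine;
			}
			if ( array_pop( $openRow ) ) {
				$outLine = '</tr>' . $outLine;
			}
			$tag = array_pop( $openCell );
			if ( $tag !== '' ) {
				$outLine = "</$tag>" . $outLine;
			}
		} elseif ( $start === '|-' ) {
			// Row separator: close the open cell and row; the next <tr>
			// is opened lazily by the first cell that follows
			$outLine = array_pop( $openRow ) ? '</tr>' : '';
			$tag = array_pop( $openCell );
			if ( $tag !== '' ) {
				$outLine = "</$tag>" . $outLine;
			}
			array_push( $openRow, false );
			array_push( $openCell, '' );
			$hasRow[count( $hasRow ) - 1] = true;
		} elseif ( $line[0] === '|' || $line[0] === '!' ) {
			// Cell: open a row if needed, close the previous cell
			$previous = array_pop( $openRow ) ? '' : "<tr>\n";
			array_push( $openRow, true );
			$hasRow[count( $hasRow ) - 1] = true;
			$tag = array_pop( $openCell );
			if ( $tag !== '' ) {
				$previous = "</$tag>\n" . $previous;
			}
			$tag = $line[0] === '!' ? 'th' : 'td';
			array_push( $openCell, $tag );
			$outLine = $previous . "<$tag>" . substr( $line, 1 );
		} else {
			// Continuation of the current cell's content
			$outLine = $rawLine;
		}
		$out .= $outLine . "\n";
	}
	return substr( $out, 0, -1 ); // drop the final newline
}
```

Under these assumptions, sketchTableStuff( "{|\n| 1\n| 2\n|}" ) returns "<table>\n<tr>\n<td> 1\n</td>\n<td> 2\n</td></tr></table>". This shows the two traits visible in the reverted parser test expectations: cell content keeps its leading space, and the closing tags are run together on the last line. An empty table ("{|\n|}") gets the placeholder row "<tr><td></td></tr>".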

Sign-offs

User | Flag | Date
MarkAHershberger | tested | 14:23, 16 September 2011
Aaron Schulz | inspected | 20:57, 18 September 2011

Follow-up revisions

Revision | Commit summary | Author | Date
r97150 | Update jquery.tablesorter for r97145: emulate <thead> if there is no <thead> ... | catrope | 13:15, 15 September 2011
r97173 | Merged revisions 97087,97091-97092,97094,97096-97098,97100-97101,97103,97136,... | dantman | 16:19, 15 September 2011
r97459 | REL1_18 MFT r97145, r97150, r97378... | reedy | 08:21, 19 September 2011

Past revisions this follows-up on

Revision | Commit summary | Author | Date
r85922 |  | diebuche | 21:27, 12 April 2011

Comments

Comment by Tim Starling, 12:35, 15 September 2011

Roan is working on the required tablesorter updates.

Comment by MarkAHershberger, 14:41, 16 September 2011

Marking per IRC and, heck, this is a revert.
