r59741 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r59741 (previous: r59740, next: r59742)
Date:19:39, 4 December 2009
Author:simetrical
Status:ok
Comment:
Add DTD to fix well-formedness errors in HTML5

Now actually tested, using Python's SAX module. You can verify that a
page is well-formed XML (or at least won't break in pywikipediabot) with
a program like this:

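    import xml.sax

    # A do-nothing handler: we only care whether parsing succeeds.
    class Myhandler(xml.sax.ContentHandler):
        pass

    h = Myhandler()
    # Raises xml.sax.SAXParseException if the page is not well-formed.
    xml.sax.parse("http://localhost/git-trunk/phase3/index.php?title=Special:UserLogin", h)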

If the page is not well-formed, this will throw an exception. It did
with the old doctype, but no longer does if $wgWellFormedXml == true.
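
The failure mode can also be reproduced without a running wiki. The following is a minimal sketch, not part of the commit: the helper name is_well_formed and the two sample documents are made up for illustration, and the second sample defines &nbsp; in an internal DTD subset rather than through the external XHTML 1.0 Strict DTD the commit uses, so that no network access is needed.

```python
import xml.sax


def is_well_formed(document: bytes) -> bool:
    """Return True if the SAX parser accepts the document without error."""
    try:
        xml.sax.parseString(document, xml.sax.ContentHandler())
    except xml.sax.SAXParseException:
        return False
    return True


# Bare HTML5 doctype: no DTD defines &nbsp;, so the reference is an
# undefined entity and the parser rejects the document.
BARE = b'<!DOCTYPE html>\n<html><body>a&nbsp;b</body></html>'

# Hypothetical stand-in for a doctype that supplies entity definitions:
# the internal subset declares &nbsp; itself.
WITH_ENTITY = (
    b'<!DOCTYPE html [<!ENTITY nbsp "&#160;">]>\n'
    b'<html><body>a&nbsp;b</body></html>'
)

if __name__ == "__main__":
    print("bare doctype:", is_well_formed(BARE))
    print("with entity definition:", is_well_formed(WITH_ENTITY))
```

Whether a parser actually fetches an external DTD such as xhtml1-strict.dtd depends on the parser and its configuration, so a check against a live page should be run with the same parser the consuming tool uses.
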
Modified paths:
  • /trunk/phase3/includes/OutputPage.php (modified) (history)

Diff

Index: trunk/phase3/includes/OutputPage.php
@@ -1567,7 +1567,7 @@
 	public function headElement( Skin $sk, $includeStyle = true ) {
 		global $wgDocType, $wgDTD, $wgContLanguageCode, $wgOutputEncoding, $wgMimeType;
 		global $wgXhtmlDefaultNamespace, $wgXhtmlNamespaces, $wgHtml5Version;
-		global $wgContLang, $wgUseTrackbacks, $wgStyleVersion, $wgHtml5;
+		global $wgContLang, $wgUseTrackbacks, $wgStyleVersion, $wgHtml5, $wgWellFormedXml;
 
 		$this->addMeta( "http:Content-Type", "$wgMimeType; charset={$wgOutputEncoding}" );
 		if ( $sk->commonPrintStylesheet() ) {
@@ -1588,9 +1588,21 @@
 		$dir = $wgContLang->getDir();
 
 		if ( $wgHtml5 ) {
-			$ret .= "<!DOCTYPE html>\n";
+			if ( $wgWellFormedXml ) {
+				# Unknown elements and attributes are okay in XML, but unknown
+				# named entities are well-formedness errors and will break XML
+				# parsers. Thus we need a doctype that gives us appropriate
+				# entity definitions. The HTML5 spec permits four legacy
+				# doctypes as obsolete but conforming, so let's pick one of
+				# those, although it makes our pages look like XHTML1 Strict.
+				# Isn't compatibility great?
+				$ret .= "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">\n";
+			} else {
+				# Much saner.
+				$ret .= "<!doctype html>\n";
+			}
 			$ret .= "<html lang=\"$wgContLanguageCode\" dir=\"$dir\" ";
-			if ($wgHtml5Version) $ret .= " version=\"$wgHtml5Version\" ";
+			if ( $wgHtml5Version ) $ret .= " version=\"$wgHtml5Version\" ";
 			$ret .= ">\n";
 		} else {
 			$ret .= "<!DOCTYPE html PUBLIC \"$wgDocType\" \"$wgDTD\">\n";
