r88871 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r88870‎ | r88871 | r88872 >
Date:23:59, 25 May 2011
Author:brion
Status:ok (Comments)
Tags:
Comment:
* (bug 27465) Fix metadata extraction for SVG files using unusual namespace names

The previous fix (in r82307) only checked explicitly for a namespace given the 'svg' prefix; this fix use XML namespacing support on XMLReader to check for the actual namespace URI correctly.
Fixed up a test case (for RDF extraction) and added trimming on the whitespace.
Also added another test case file that doesn't use a namespace name on the root.
Modified paths:
  • /trunk/phase3/includes/media/SVGMetadataExtractor.php (modified) (history)
  • /trunk/phase3/tests/phpunit/includes/media/SVGMetadataExtractorTest.php (modified) (history)
  • /trunk/phase3/tests/phpunit/includes/media/Wikimedia-logo.svg (added) (history)

Diff [purge]

Index: trunk/phase3/tests/phpunit/includes/media/Wikimedia-logo.svg
@@ -0,0 +1,14 @@
 2+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
 3+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
 4+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" id="Wikimedia logo" viewBox="-599 -599 1198 1198" width="1024" height="1024">
 5+<defs>
 6+ <clipPath id="mask">
 7+ <path d="M 47.5,-87.5 v 425 h -95 v -425 l -552,-552 v 1250 h 1199 v -1250 z"/>
 8+ </clipPath>
 9+</defs>
 10+<g clip-path="url(#mask)">
 11+ <circle id="green parts" fill="#396" r="336.5"/>
 12+ <circle id="blue arc" fill="none" stroke="#069" r="480.25" stroke-width="135.5"/>
 13+</g>
 14+<circle fill="#900" cy="-379.5" r="184.5" id="red circle"/>
 15+</svg>
\ No newline at end of file
Index: trunk/phase3/tests/phpunit/includes/media/SVGMetadataExtractorTest.php
@@ -26,6 +26,13 @@
2727 $base = dirname( __FILE__ );
2828 return array(
2929 array(
 30+ "$base/Wikimedia-logo.svg",
 31+ array(
 32+ 'width' => 1024,
 33+ 'height' => 1024
 34+ )
 35+ ),
 36+ array(
3037 "$base/QA_icon.svg",
3138 array(
3239 'width' => 60,
@@ -42,8 +49,15 @@
4350 array(
4451 "$base/US_states_by_total_state_tax_revenue.svg",
4552 array(
46 - 'width' => 593,
47 - 'height' => 959
 53+ 'height' => 593,
 54+ 'metadata' =>
 55+ '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
 56+ <ns4:Work xmlns:ns4="http://creativecommons.org/ns#" rdf:about="">
 57+ <ns5:format xmlns:ns5="http://purl.org/dc/elements/1.1/">image/svg+xml</ns5:format>
 58+ <ns5:type xmlns:ns5="http://purl.org/dc/elements/1.1/" rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>
 59+ </ns4:Work>
 60+ </rdf:RDF>',
 61+ 'width' => 959
4862 )
4963 ),
5064 );
Index: trunk/phase3/includes/media/SVGMetadataExtractor.php
@@ -35,6 +35,7 @@
3636 class SVGReader {
3737 const DEFAULT_WIDTH = 512;
3838 const DEFAULT_HEIGHT = 512;
 39+ const NS_SVG = 'http://www.w3.org/2000/svg';
3940
4041 private $reader = null;
4142 private $mDebug = false;
@@ -101,9 +102,9 @@
102103 $keepReading = $this->reader->read();
103104 }
104105
105 - if ( $this->reader->name != 'svg' ) {
 106+ if ( $this->reader->localName != 'svg' || $this->reader->namespaceURI != self::NS_SVG ) {
106107 throw new MWException( "Expected <svg> tag, got ".
107 - $this->reader->name );
 108+ $this->reader->localName . " in NS " . $this->reader->namespaceURI );
108109 }
109110 $this->debug( "<svg> tag is correct." );
110111 $this->handleSVGAttribs();
@@ -111,18 +112,19 @@
112113 $exitDepth = $this->reader->depth;
113114 $keepReading = $this->reader->read();
114115 while ( $keepReading ) {
115 - $tag = $this->reader->name;
 116+ $tag = $this->reader->localName;
116117 $type = $this->reader->nodeType;
 118+ $isSVG = ($this->reader->namespaceURI == self::NS_SVG);
117119
118120 $this->debug( "$tag" );
119121
120 - if ( $tag == 'svg' && $type == XmlReader::END_ELEMENT && $this->reader->depth <= $exitDepth ) {
 122+ if ( $isSVG && $tag == 'svg' && $type == XmlReader::END_ELEMENT && $this->reader->depth <= $exitDepth ) {
121123 break;
122 - } elseif ( $tag == 'title' ) {
 124+ } elseif ( $isSVG && $tag == 'title' ) {
123125 $this->readField( $tag, 'title' );
124 - } elseif ( $tag == 'desc' ) {
 126+ } elseif ( $isSVG && $tag == 'desc' ) {
125127 $this->readField( $tag, 'description' );
126 - } elseif ( $tag == 'metadata' && $type == XmlReader::ELEMENT ) {
 128+ } elseif ( $isSVG && $tag == 'metadata' && $type == XmlReader::ELEMENT ) {
127129 $this->readXml( $tag, 'metadata' );
128130 } elseif ( $tag !== '#text' ) {
129131 $this->debug( "Unhandled top-level XML tag $tag" );
@@ -155,7 +157,7 @@
156158 }
157159 $keepReading = $this->reader->read();
158160 while( $keepReading ) {
159 - if( $this->reader->name == $name && $this->reader->nodeType == XmlReader::END_ELEMENT ) {
 161+ if( $this->reader->localName == $name && $this->namespaceURI == self::NS_SVG && $this->reader->nodeType == XmlReader::END_ELEMENT ) {
160162 break;
161163 } elseif( $this->reader->nodeType == XmlReader::TEXT ){
162164 $this->metadata[$metafield] = trim( $this->reader->value );
@@ -175,7 +177,7 @@
176178 return;
177179 }
178180 // TODO: find and store type of xml snippet. metadata['metadataType'] = "rdf"
179 - $this->metadata[$metafield] = $this->reader->readInnerXML();
 181+ $this->metadata[$metafield] = trim( $this->reader->readInnerXML() );
180182 $this->reader->next();
181183 }
182184
@@ -195,11 +197,11 @@
196198 $exitDepth = $this->reader->depth;
197199 $keepReading = $this->reader->read();
198200 while( $keepReading ) {
199 - if( $this->reader->name == $name && $this->reader->depth <= $exitDepth
 201+ if( $this->reader->localName == $name && $this->reader->depth <= $exitDepth
200202 && $this->reader->nodeType == XmlReader::END_ELEMENT ) {
201203 break;
202 - } elseif ( $this->reader->nodeType == XmlReader::ELEMENT ) {
203 - switch( $this->reader->name ) {
 204+ } elseif ( $this->reader->namespaceURI == self::NS_SVG && $this->reader->nodeType == XmlReader::ELEMENT ) {
 205+ switch( $this->reader->localName ) {
204206 case 'animate':
205207 case 'set':
206208 case 'animateMotion':

Follow-up revisions

RevisionCommit summaryAuthorDate
r896771.17: MFT r88492, r88870, r88871, r89003, r89108, r89114, r89115, r89129, r89...catrope19:14, 7 June 2011
r922861.17wmf1 MFT r88871, without test stuff (moved location)reedy19:19, 15 July 2011
r92330REL1_18 MFT r88750, r88870, r88871, r89003, r89005, r89114, r89115, r89129, r...reedy22:56, 15 July 2011
r94277Fix Bug #30322 “SVG metadata is read incorrectly” by applying supplied patchmah19:51, 11 August 2011

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r82307Add support for namespace prefixed elements in svg. ie. <svg:svg>...hartman22:45, 16 February 2011
r88865Test cases for bug 27465: problems with XML namespace handling break SVGMetad...brion23:34, 25 May 2011
r88870Reverting r82307 (bug 27465) as initial step to recommitting a cleaner fix.brion23:57, 25 May 2011

Comments

#Comment by Brion VIBBER (talk | contribs)   00:05, 26 May 2011

Combines with r88871 to make the second diff cleaner. Needs merge to 1.17, 1.17-wmf, 1.18 branches and deployment fix for http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Problems_with_SVG_images

#Comment by Catrope (talk | contribs)   16:15, 7 June 2011

Combine with itself?!?

#Comment by Brion VIBBER (talk | contribs)   00:06, 26 May 2011

Note that the test cases can be left out for merging to 1.17 since the dir structure is different; if want to merge them, they're started in r88865 on trunk.

#Comment by Mormegil (talk | contribs)   11:29, 11 August 2011

There is one missing “reader->” there, breaking the whole thing. See bugzilla:30322.

Status & tagging log