r67869 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r67868‎ | r67869 | r67870 >
Date:21:57, 11 June 2010
Author:simetrical
Status:ok (Comments)
Tags:
Comment:
Don't allow boolean HTML5 attribs into XHTML1

The code didn't handle the case where a non-associative array was passed
for the attributes, like array( 'autofocus' ) instead of array(
'autofocus' => '' ). In retrospect, allowing this syntax was a bad
decision and I wish I hadn't. Associative arrays shouldn't pretend to
be lists. Probably too much trouble to change it now.
Modified paths:
  • /trunk/phase3/includes/Html.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/Html.php
@@ -180,22 +180,6 @@
181181 if ( $element == 'textarea' && isset( $attribs['maxlength'] ) ) {
182182 unset( $attribs['maxlength'] );
183183 }
184 - # Here we're blacklisting some HTML5-only attributes...
185 - $html5attribs = array(
186 - 'autocomplete',
187 - 'autofocus',
188 - 'max',
189 - 'min',
190 - 'multiple',
191 - 'pattern',
192 - 'placeholder',
193 - 'required',
194 - 'step',
195 - 'spellcheck',
196 - );
197 - foreach ( $html5attribs as $badAttr ) {
198 - unset( $attribs[$badAttr] );
199 - }
200184 }
201185
202186 return "<$element" . self::expandAttributes(
@@ -381,6 +365,22 @@
382366 continue;
383367 }
384368
 369+ # Here we're blacklisting some HTML5-only attributes...
 370+ if ( !$wgHtml5 && in_array( $key, array(
 371+ 'autocomplete',
 372+ 'autofocus',
 373+ 'max',
 374+ 'min',
 375+ 'multiple',
 376+ 'pattern',
 377+ 'placeholder',
 378+ 'required',
 379+ 'step',
 380+ 'spellcheck',
 381+ ) ) ) {
 382+ continue;
 383+ }
 384+
385385 # See the "Attributes" section in the HTML syntax part of HTML5,
386386 # 9.1.2.3 as of 2009-08-10. Most attributes can have quotation
387387 # marks omitted, but not all. (Although a literal " is not

Follow-up revisions

RevisionCommit summaryAuthorDate
r67975MFT r67283, r67869 (bug 23769): disable HTML 5 form validation attributeststarling02:01, 14 June 2010

Comments

#Comment by Tim Starling (talk | contribs)   01:58, 14 June 2010

You might want to think about flipping that array. in_array() is O(N), and isset($a[$k]) is O(1). The setup time for an associative array is slightly longer, but IIRC previous profiling in a similar situation indicated that it makes sense if you have to call the function more than about 10 times.

#Comment by Simetrical (talk | contribs)   14:52, 14 June 2010
0 10:46:08 /var/www/git-trunk/phase3$ time echo '<?php $x = array( "a", "b", "c", "d", "e", "f", "g", "h", "i", "j" ); for ( $i = 0; $i < 1000000; $i++ ) { in_array( "g", $x ); in_array( "q", $x ); }' | php

real	0m2.860s
user	0m2.728s
sys	0m0.024s
0 10:46:12 /var/www/git-trunk/phase3$ time echo '<?php $x = array( "a" => "", "b" => "", "c" => "", "d" => "", "e" => "", "f" => "", "g" => "", "h" => "", "i" => "", "j" => "" ); for ( $i = 0; $i < 1000000; $i++ ) { isset( $x["g"] ); isset( $x["q"] ); }' | php

real	0m0.398s
user	0m0.380s
sys	0m0.008s

That says to me that an associative array would be much faster, but we're talking about a microsecond per call either way. The function is only going to be called once per attribute in the output at most, and pages typically don't have more than a few thousand attributes. Granted, this pattern occurs a number of times in the file, but is this really a useful optimization except in very hot code paths? I'll do it if you want, but it does make the code slightly uglier.

#Comment by Tim Starling (talk | contribs)   01:37, 15 June 2010

The difference it makes depends on the array size. For an array with 22 elements instead of 10, like $boolAttribs, I get a difference of 2.5us per call. About 0.8us of that is actually the function call overhead, judging by the array_key_exists() timings. See Sanitizer::removeHTMLtags() for an example of how ugly it would be. Html::expandAttributes() is probably the hottest function in the Html module, but if you don't want to do it, that's OK with me.

Status & tagging log