r110900 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r110899‎ | r110900 | r110901 >
Date:00:03, 8 February 2012
Author:tstarling
Status:ok (Comments)
Tags:needs-js-test 
Comment:
Revert r110321: introduces an XSS vulnerability because FormatJson::encode() does not prevent the termination of CDATA sections when JavaScript is embedded in HTML.
Modified paths:
  • /trunk/phase3/includes/Xml.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/Xml.php
@@ -607,19 +607,52 @@
608608 }
609609
610610 /**
611 - * Encode a variable of unknown type to JavaScript. If you're not dealing
612 - * with potential instances of XmlJsCode (which bypass encoding), then
613 - * FormatJson::encode should be used directly.
 611+ * Encode a variable of unknown type to JavaScript.
 612+ * Arrays are converted to JS arrays, objects are converted to JS associative
 613+ * arrays (objects). So cast your PHP associative arrays to objects before
 614+ * passing them to here.
614615 *
615616 * @param $value
616617 *
617618 * @return string
618619 */
619620 public static function encodeJsVar( $value ) {
620 - if ( $value instanceof XmlJsCode ) {
621 - return $value->value;
 621+ if ( is_bool( $value ) ) {
 622+ $s = $value ? 'true' : 'false';
 623+ } elseif ( is_null( $value ) ) {
 624+ $s = 'null';
 625+ } elseif ( is_int( $value ) || is_float( $value ) ) {
 626+ $s = strval($value);
 627+ } elseif ( is_array( $value ) && // Make sure it's not associative.
 628+ array_keys($value) === range( 0, count($value) - 1 ) ||
 629+ count($value) == 0
 630+ ) {
 631+ $s = '[';
 632+ foreach ( $value as $elt ) {
 633+ if ( $s != '[' ) {
 634+ $s .= ',';
 635+ }
 636+ $s .= self::encodeJsVar( $elt );
 637+ }
 638+ $s .= ']';
 639+ } elseif ( $value instanceof XmlJsCode ) {
 640+ $s = $value->value;
 641+ } elseif ( is_object( $value ) || is_array( $value ) ) {
 642+ // Objects and associative arrays
 643+ $s = '{';
 644+ foreach ( (array)$value as $name => $elt ) {
 645+ if ( $s != '{' ) {
 646+ $s .= ',';
 647+ }
 648+
 649+ $s .= '"' . self::escapeJsString( $name ) . '":' .
 650+ self::encodeJsVar( $elt );
 651+ }
 652+ $s .= '}';
 653+ } else {
 654+ $s = '"' . self::escapeJsString( $value ) . '"';
622655 }
623 - return FormatJson::encode( $value );
 656+ return $s;
624657 }
625658
626659 /**

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r110321[Xml::encodeJsVar] Use FormatJson::encode / native json_encode instead...krinkle19:45, 30 January 2012

Comments

#Comment by Krinkle (talk | contribs)   00:37, 8 February 2012

Aggregating from #mediawiki: http://toolserver.org/~mwbot/logs/%23mediawiki/20120208.txt

[00:05:00] <Krinkle>	TimStarling: Eh ? that looks worrying? Is this about Services_JSON or php's native json_encode ?
[00:05:15] <TimStarling>	both
[00:06:02] <TimStarling>	the issue is documented in Xml::escapeJsString
[00:07:37] <Krinkle>	TimStarling: So FormatJson::encode should only be used for complete javascript responses (rather than partial output in a <script> tag).  
[00:07:57] <Krinkle>	TimStarling: In that case, do we need FormatJson::encode at all ? Or is it better performance wise when CDATA is not a concern (e.g. in a load.php response)
[00:08:33] <TimStarling>	both should be secure
[00:08:49] <TimStarling>	we need multiple lines of defense
[00:10:35] <Krinkle>	TimStarling: I mean if our (seemingly much simpler) json encoder in Xml.php is better and safer than Services_JSON, we might as well remove all body of FormatJson::encode and re-route it to Xml methods (instead of the other way around).
[00:12:40] <TimStarling>	PHP 5.3 has a JSON_HEX_TAG option to json_encode() which addresses this issue, as I noted on that bug report
[00:13:10] <TimStarling>	but yes, it doesn't make sense to have two pure PHP implementations, one secure and one subtly flawed
[00:13:24] <Krinkle>	Yeah
[00:13:40] <TimStarling>	I mean, you're one of our best frontend developers, if you don't know about this issue, there's a pretty high chance that it will come up again
[00:14:13] <TimStarling>	I should review all direct calls to json_encode and FormatJson::encode(), maybe there are vulnerabilities already
[00:14:22] <Krinkle>	TimStarling: Well, I obviously knew about < and > escapement being a problem when outputted in <script>
[00:14:37] <Krinkle>	TimStarling: I wrongly assumed it would be taken care of by the encoder we use all over the place
[00:15:27] <Krinkle>	TimStarling: I know there are no direct alls to json_encode in our codebase (other than FormatJson::encode itself), I ack'ed that before
[00:15:55] <Krinkle>	There most likely are calls to FormatJson::encode
[00:17:01] <Krinkle>	TimStarling: So we should raise the requirements before handing off to json_encode to include support for JSON_HEX_TAG, and pass it. And then either fix the php implementation of Services_JSON or let it call Xml.php from FormatJson::encode
[00:17:17] <TimStarling>	yes
[00:17:21] <Krinkle>	aight
[00:17:21] <TimStarling>	most of the code in Services_JSON deals with UTF-8 to UTF-16 conversion
[00:18:10] <TimStarling>	which we don't need because we use a UTF-8 output format, so our strings can have literal unicode characters
[00:18:29] <Krinkle>	Aside from using Services_JSON or the Xml-class methods, I'd prefer not having this Xml.php either way, but that's another issue.
[00:18:29] <TimStarling>	if you were embedding unicode text in a latin-1 HTML document, it would matter
[00:18:47] <TimStarling>	it can move, I don't mind
[00:19:13] <TimStarling>	maybe things like Xml::encodeJsCall() should move too
[00:21:11] <Krinkle>	TimStarling: I'm not too familiar with the encoding backgrounds. I noticed Xml uses \\x3c where JSON_HEX_TAG uses \u003C, that's just the same, right ?
[00:21:25] <TimStarling>	yeah, it's the same
[00:21:27] <Krinkle>	k
[00:22:51] <Krinkle>	TimStarling: Oh, Services_JSON of course also contains decode() which Xml.php does not
[00:26:36] <TimStarling>	the UTF-8 handling in Services_JSON::decode() is wrong and scary
[00:27:41] <TimStarling>	it allows overshort characters to break the syntax
#Comment by Tim Starling (talk | contribs)   01:24, 8 February 2012

Actually there is a difference between \x3c and \u003c: \x3c is not allowed in JSON, although it is allowed in JavaScript. Neither json_decode() nor Services_JSON::decode() recognises \x3c.

Status & tagging log