r91608 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r91607‎ | r91608 | r91609 >
Date:21:48, 6 July 2011
Author:brion
Status:resolved (Comments)
Tags:
Comment:
* (bug 28626) Validate JavaScript files and pages loaded via ResourceLoader before minification, protecting separate modules from interference

This is possibly not perfect but seems to serve for a start; follows up on r91591 that adds JSMin+ to use it in some unit tests. May want to adjust some related bits.

- $wgResourceLoaderValidateJs on by default (can be disabled)
- when loading a JS file through ResourceLoaderFileModule or ResourceLoaderWikiModule, parse it using JSMinPlus's JSParser class. If the parser throws an exception, the JS code of the offending file will be replaced by a JS exception throw listing the file or page name, line number (in original form), and description of the error from the parser.
- parsing results are cached based on md5 of content to avoid re-parsing identical text
- for JS pages loaded via direct load.php request, the parse error is thrown and visible in the JS console/error log

Issues:
- the primary use case for this is when a single load.php request implements multiple modules via mw.loader.implement() -- the loader catches the exception and skips on to the next module (good) but doesn't re-throw the exception for the JS console. It does log to console if present, but it'll only show up as a regular debug message, not an error. This can suppress visibility of errors in a module that's loaded together with other modules (such as a gadget).
- have not done performance testing on the JSParser
- have not done thorough unit testing with the JSParser
Modified paths:
  • /trunk/phase3/includes/DefaultSettings.php (modified) (history)
  • /trunk/phase3/includes/resourceloader/ResourceLoaderFileModule.php (modified) (history)
  • /trunk/phase3/includes/resourceloader/ResourceLoaderModule.php (modified) (history)
  • /trunk/phase3/includes/resourceloader/ResourceLoaderWikiModule.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/resourceloader/ResourceLoaderFileModule.php
@@ -495,6 +495,7 @@
496496 if ( $contents === false ) {
497497 throw new MWException( __METHOD__.": script file not found: \"$localPath\"" );
498498 }
 499+ $contents = $this->validateScriptFile( $fileName, $contents );
499500 $js .= $contents . "\n";
500501 }
501502 return $js;
Index: trunk/phase3/includes/resourceloader/ResourceLoaderModule.php
@@ -304,4 +304,54 @@
305305 public function isKnownEmpty( ResourceLoaderContext $context ) {
306306 return false;
307307 }
 308+
 309+
 310+ /** @var JSParser lazy-initialized; use self::javaScriptParser() */
 311+ private static $jsParser;
 312+ private static $parseCacheVersion = 1;
 313+
 314+ /**
 315+ * Validate a given script file; if valid returns the original source.
 316+ * If invalid, returns replacement JS source that throws an exception.
 317+ *
 318+ * @param string $fileName
 319+ * @param string $contents
 320+ * @return string JS with the original, or a replacement error
 321+ */
 322+ protected function validateScriptFile( $fileName, $contents ) {
 323+ global $wgResourceLoaderValidateJS;
 324+ if ( $wgResourceLoaderValidateJS ) {
 325+ // Try for cache hit
 326+ // Use CACHE_ANYTHING since filtering is very slow compared to DB queries
 327+ $key = wfMemcKey( 'resourceloader', 'jsparse', self::$parseCacheVersion, md5( $contents ) );
 328+ $cache = wfGetCache( CACHE_ANYTHING );
 329+ $cacheEntry = $cache->get( $key );
 330+ if ( is_string( $cacheEntry ) ) {
 331+ return $cacheEntry;
 332+ }
 333+
 334+ $parser = self::javaScriptParser();
 335+ try {
 336+ $parser->parse( $contents, $fileName, 1 );
 337+ $result = $contents;
 338+ } catch (Exception $e) {
 339+ // We'll save this to cache to avoid having to validate broken JS over and over...
 340+ $err = $e->getMessage();
 341+ $result = "throw new Error(" . Xml::encodeJsVar("JavaScript parse error: $err") . ");";
 342+ }
 343+
 344+ $cache->set( $key, $result );
 345+ return $result;
 346+ } else {
 347+ return $contents;
 348+ }
 349+ }
 350+
 351+ protected static function javaScriptParser() {
 352+ if ( !self::$jsParser ) {
 353+ self::$jsParser = new JSParser();
 354+ }
 355+ return self::$jsParser;
 356+ }
 357+
308358 }
Index: trunk/phase3/includes/resourceloader/ResourceLoaderWikiModule.php
@@ -82,6 +82,7 @@
8383 }
8484 $script = $this->getContent( $title );
8585 if ( strval( $script ) !== '' ) {
 86+ $script = $this->validateScriptFile( $titleText, $script );
8687 if ( strpos( $titleText, '*/' ) === false ) {
8788 $scripts .= "/* $titleText */\n";
8889 }
Index: trunk/phase3/includes/DefaultSettings.php
@@ -2519,6 +2519,12 @@
25202520 */
25212521 $wgResourceLoaderMaxQueryLength = -1;
25222522
 2523+/**
 2524+ * If set to true, JavaScript will be parsed prior to minification to validate it.
 2525+ * Parse errors will result in a JS exception being thrown during module load.
 2526+ */
 2527+$wgResourceLoaderValidateJS = true;
 2528+
25232529 /** @} */ # End of resource loader settings }
25242530
25252531

Follow-up revisions

RevisionCommit summaryAuthorDate
r91914Followup to r91608: reduce impact of bug 29784 (high jsmin+ memory usage duri...brion21:39, 11 July 2011

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r91591Followup r83885: add JSMin+ 1.3 to use its parser to verify output of JavaScr...brion20:02, 6 July 2011

Comments

#Comment by Nikerabbit (talk | contribs)   22:14, 6 July 2011

No non-js logging yet?

#Comment by Brion VIBBER (talk | contribs)   22:52, 6 July 2011

There could perhaps be some benefit to logging if we're attempting to detect bad .js files creeping into source. For user JS it's probably unnecessary noise to record that.

#Comment by Bryan (talk | contribs)   12:26, 7 July 2011

You should add the file name in the JavaScript error thrown.

#Comment by Brion VIBBER (talk | contribs)   17:53, 7 July 2011

The file name and line number are already included in the exception message.

#Comment by Bryan (talk | contribs)   18:23, 7 July 2011

But those are the line number in the concatenated file, right?

#Comment by Brion VIBBER (talk | contribs)   18:30, 7 July 2011

No, the JS parser is being run over the original non-concatenated files as they're fetched, so the filename (actual file or wiki page title) and line number will be for the original source.

Note that if we wanted to just switch to using JSMin+ as the minifier, we'd have three options:

  • continue to do an explicit parse run on each file just for error-checking purposes, then run JSMin+ to parse & minify the concatenated batch (problem: double parsing is slower)
  • minify the individual files separately before concatenation, so parse errors can get reported individually (problem: may require more retooling of ResourceLoader, which applies minification as a filter on top of multiple things already concatenated together)
  • parse each file separately, then combine the parse trees and minify based on that (problems: all the retooling pain from above, plus more ;)
#Comment by Aaron Schulz (talk | contribs)   23:42, 3 August 2011

Why not make $jsParser a static member of the javaScriptParser() function?

#Comment by Brion VIBBER (talk | contribs)   11:42, 4 August 2011

I'm generally not a fan of static variables within functions; it feels icky and non-obvious, and there's usually *no way to reset them* (say, when changing configuration during a test run or a script that has to handle things for multiple sites) because they can't be accessed from outside. Also can't reuse the variable from another method if you discover you need it.

#Comment by Aaron Schulz (talk | contribs)   16:25, 4 August 2011

I'd agree on the reset problem.

Status & tagging log