r80656 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r80655‎ | r80656 | r80657 >
Date:21:57, 20 January 2011
Author:tparscal
Status:resolved (Comments)
Tags:
Comment:
Resolved bug 26791 by replacing JSMin with a new library called JavaScriptDistiller, which is an improved version of the minification bits from JavaScriptPacker, an LGPL library. Good news - it's 2x faster than our optimized JSMin anyways, and more configurable to boot.
Modified paths:
  • /trunk/phase3/includes/AutoLoader.php (modified) (history)
  • /trunk/phase3/includes/DefaultSettings.php (modified) (history)
  • /trunk/phase3/includes/libs/JSMin.php (deleted) (history)
  • /trunk/phase3/includes/libs/JavaScriptDistiller.php (added) (history)
  • /trunk/phase3/includes/libs/ParseMaster.php (added) (history)
  • /trunk/phase3/includes/resourceloader/ResourceLoader.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/resourceloader/ResourceLoader.php
@@ -108,7 +108,7 @@
109109 * Runs JavaScript or CSS data through a filter, caching the filtered result for future calls.
110110 *
111111 * Available filters are:
112 - * - minify-js \see JSMin::minify
 112+ * - minify-js \see JavaScriptDistiller::stripWhiteSpace
113113 * - minify-css \see CSSMin::minify
114114 *
115115 * If $data is empty, only contains whitespace or the filter was unknown,
@@ -119,6 +119,8 @@
120120 * @return String: Filtered data, or a comment containing an error message
121121 */
122122 protected function filter( $filter, $data ) {
 123+ global $wgResourceLoaderMinifyJSVerticalSpace;
 124+
123125 wfProfileIn( __METHOD__ );
124126
125127 // For empty/whitespace-only data or for unknown filters, don't perform
@@ -144,7 +146,9 @@
145147 try {
146148 switch ( $filter ) {
147149 case 'minify-js':
148 - $result = JSMin::minify( $data );
 150+ $result = JavaScriptDistiller::stripWhiteSpace(
 151+ $data, $wgResourceLoaderMinifyJSVerticalSpace
 152+ );
149153 break;
150154 case 'minify-css':
151155 $result = CSSMin::minify( $data );
Index: trunk/phase3/includes/AutoLoader.php
@@ -138,7 +138,7 @@
139139 'IndexPager' => 'includes/Pager.php',
140140 'Interwiki' => 'includes/Interwiki.php',
141141 'IP' => 'includes/IP.php',
142 - 'JSMin' => 'includes/libs/JSMin.php',
 142+ 'JavaScriptDistiller' => 'includes/libs/JavaScriptDistiller.php',
143143 'LCStore_DB' => 'includes/LocalisationCache.php',
144144 'LCStore_CDB' => 'includes/LocalisationCache.php',
145145 'LCStore_Null' => 'includes/LocalisationCache.php',
@@ -180,6 +180,7 @@
181181 'PageHistory' => 'includes/HistoryPage.php',
182182 'PageHistoryPager' => 'includes/HistoryPage.php',
183183 'Pager' => 'includes/Pager.php',
 184+ 'ParseMaster' => 'includes/libs/ParseMaster.php',
184185 'PasswordError' => 'includes/User.php',
185186 'PatrolLog' => 'includes/PatrolLog.php',
186187 'PhpHttpRequest' => 'includes/HttpFunctions.php',
Index: trunk/phase3/includes/libs/JSMin.php
@@ -1,283 +0,0 @@
2 -<?php
3 -/**
4 - * jsmin.php - PHP implementation of Douglas Crockford's JSMin.
5 - *
6 - * This is pretty much a direct port of jsmin.c to PHP with just a few
7 - * PHP-specific performance tweaks. Also, whereas jsmin.c reads from stdin and
8 - * outputs to stdout, this library accepts a string as input and returns another
9 - * string as output.
10 - *
11 - * PHP 5 or higher is required.
12 - *
13 - * Permission is hereby granted to use this version of the library under the
14 - * same terms as jsmin.c, which has the following license:
15 - *
16 - * --
17 - * Copyright (c) 2002 Douglas Crockford (www.crockford.com)
18 - *
19 - * Permission is hereby granted, free of charge, to any person obtaining a copy of
20 - * this software and associated documentation files (the "Software"), to deal in
21 - * the Software without restriction, including without limitation the rights to
22 - * use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
23 - * of the Software, and to permit persons to whom the Software is furnished to do
24 - * so, subject to the following conditions:
25 - *
26 - * The above copyright notice and this permission notice shall be included in all
27 - * copies or substantial portions of the Software.
28 - *
29 - * The Software shall be used for Good, not Evil.
30 - *
31 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
32 - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
33 - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
34 - * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
35 - * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
36 - * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
37 - * SOFTWARE.
38 - * --
39 - *
40 - * @file
41 - * @author Ryan Grove <ryan@wonko.com>
42 - * @copyright 2002 Douglas Crockford <douglas@crockford.com> (jsmin.c)
43 - * @copyright 2008 Ryan Grove <ryan@wonko.com> (PHP port)
44 - * @license http://opensource.org/licenses/mit-license.php MIT License
45 - * @version 1.1.1 (2008-03-02)
46 - * @link http://github.com/rgrove/jsmin-php/
47 - */
48 -
49 -class JSMin {
50 - const ORD_LF = 10;
51 - const ORD_SPACE = 32;
52 -
53 - // Action constants
54 - const OUTPUT = 1;
55 - const DELETE_A = 2;
56 - const DELETE_B = 3;
57 -
58 - /** Current character */
59 - protected $a = '';
60 -
61 - /** Next character */
62 - protected $b = '';
63 -
64 - protected $input = '';
65 - protected $inputIndex = 0;
66 - protected $inputLength = 0;
67 - protected $lookAhead = null;
68 - protected $output = '';
69 -
70 - // -- Public Static Methods --------------------------------------------------
71 -
72 - public static function minify( $js ) {
73 - $jsmin = new self( $js );
74 - $ret = $jsmin->min();
75 - return $ret;
76 - }
77 -
78 - // -- Public Instance Methods ------------------------------------------------
79 -
80 - public function __construct( $input ) {
81 - // Fix line endings
82 - $this->input = str_replace( "\r\n", "\n", $input );
83 - // Replace tabs and other control characters (except LF) with spaces
84 - $this->input = preg_replace( '/[\x00-\x09\x0b-\x1f]/', ' ', $this->input );
85 - $this->inputLength = strlen( $this->input );
86 - }
87 -
88 - // -- Protected Instance Methods ---------------------------------------------
89 -
90 - /**
91 - * Do something! What you do is determined by the argument:
92 - * - self::OUTPUT Output A. Copy B to A. Get the next B.
93 - * - self::DELETE_A Copy B to A. Get the next B. (Delete A).
94 - * - self::DELETE_B Get the next B. (Delete B).
95 - * action treats a string as a single character. Wow!
96 - * action recognizes a regular expression if it is preceded by ( or , or =.
97 - */
98 - protected function action( $d ) {
99 - switch( $d ) {
100 - case self::OUTPUT:
101 - $this->output .= $this->a;
102 -
103 - case self::DELETE_A:
104 - $this->a = $this->b;
105 -
106 - if ( $this->a === "'" || $this->a === '"' ) {
107 - $interestingChars = $this->a . "\\\n";
108 - $this->output .= $this->a;
109 - for ( ; ; ) {
110 - $runLength = strcspn( $this->input, $interestingChars, $this->inputIndex );
111 - $this->output .= substr( $this->input, $this->inputIndex, $runLength );
112 - $this->inputIndex += $runLength;
113 - $c = $this->get();
114 -
115 - if ( $c === $this->b ) {
116 - break;
117 - }
118 -
119 - if ( $c === "\n" || $c === null ) {
120 - throw new JSMinException( 'Unterminated string literal.' );
121 - }
122 -
123 - if ( $c === '\\' ) {
124 - $this->output .= $c . $this->get();
125 - }
126 - }
127 - }
128 -
129 - case self::DELETE_B:
130 - $this->b = $this->next();
131 -
132 - if ( $this->b === '/' && (
133 - $this->a === '(' || $this->a === ',' || $this->a === '=' ||
134 - $this->a === ':' || $this->a === '[' || $this->a === '!' ||
135 - $this->a === '&' || $this->a === '|' || $this->a === '?' ) ) {
136 -
137 - $this->output .= $this->a . $this->b;
138 -
139 - for ( ; ; ) {
140 - $runLength = strcspn( $this->input, "/\\\n", $this->inputIndex );
141 - $this->output .= substr( $this->input, $this->inputIndex, $runLength );
142 - $this->inputIndex += $runLength;
143 - $this->a = $this->get();
144 -
145 - if ( $this->a === '/' ) {
146 - break;
147 - } elseif ( $this->a === '\\' ) {
148 - $this->output .= $this->a;
149 - $this->a = $this->get();
150 - } elseif ( $this->a === "\n" || $this->a === null ) {
151 - throw new JSMinException( 'Unterminated regular expression ' .
152 - 'literal.' );
153 - }
154 -
155 - $this->output .= $this->a;
156 - }
157 -
158 - $this->b = $this->next();
159 - }
160 - }
161 - }
162 -
163 - /**
164 - * Return the next character from the input. Watch out for lookahead. If
165 - * the character is a control character, translate it to a space or
166 - * linefeed.
167 - */
168 - protected function get() {
169 - if ( $this->inputIndex < $this->inputLength ) {
170 - return $this->input[$this->inputIndex++];
171 - } else {
172 - return null;
173 - }
174 - }
175 -
176 - /**
177 - * Return true if the character is a letter, digit, underscore,
178 - * dollar sign, or non-ASCII character.
179 - */
180 - protected function isAlphaNum( $c ) {
181 - return ord( $c ) > 126 || $c === '\\' || preg_match( '/^[\w\$]$/', $c ) === 1;
182 - }
183 -
184 - /**
185 - * Copy the input to the output, deleting the characters which are
186 - * insignificant to JavaScript. Comments will be removed. Tabs will be
187 - * replaced with spaces. Carriage returns will be replaced with linefeeds.
188 - * Most spaces and linefeeds will be removed.
189 - */
190 - protected function min() {
191 - $this->a = "\n";
192 - $this->action( self::DELETE_B );
193 -
194 - while ( $this->a !== null ) {
195 - switch ( $this->a ) {
196 - case ' ':
197 - if ( $this->isAlphaNum( $this->b ) ) {
198 - $this->action( self::OUTPUT );
199 - } else {
200 - $this->action( self::DELETE_A );
201 - }
202 - break;
203 -
204 - case "\n":
205 - switch ( $this->b ) {
206 - case ' ':
207 - $this->action( self::DELETE_B );
208 - break;
209 -
210 - default:
211 - $this->action( self::OUTPUT );
212 - }
213 - break;
214 -
215 - default:
216 - switch ( $this->b ) {
217 - case ' ':
218 - if ( $this->isAlphaNum( $this->a ) ) {
219 - $this->action( self::OUTPUT );
220 - break;
221 - }
222 -
223 - $this->action( self::DELETE_B );
224 - break;
225 - default:
226 - $this->action( self::OUTPUT );
227 - break;
228 - }
229 - }
230 - }
231 -
232 - // Remove initial line break
233 - if ( $this->output[0] !== "\n" ) {
234 - throw new JSMinException( 'Unexpected lack of line break.' );
235 - }
236 - if ( $this->output === "\n" ) {
237 - return '';
238 - } else {
239 - return substr( $this->output, 1 );
240 - }
241 - }
242 -
243 - /**
244 - * Get the next character, excluding comments.
245 - */
246 - protected function next() {
247 - if ( $this->inputIndex >= $this->inputLength ) {
248 - return null;
249 - }
250 - $c = $this->input[$this->inputIndex++];
251 -
252 - if ( $this->inputIndex >= $this->inputLength ) {
253 - return $c;
254 - }
255 -
256 - if ( $c === '/' ) {
257 - switch( $this->input[$this->inputIndex] ) {
258 - case '/':
259 - $this->inputIndex += strcspn( $this->input, "\n", $this->inputIndex ) + 1;
260 - return "\n";
261 - case '*':
262 - $endPos = strpos( $this->input, '*/', $this->inputIndex + 1 );
263 - if ( $endPos === false ) {
264 - throw new JSMinException( 'Unterminated comment.' );
265 - }
266 - $numLines = substr_count( $this->input, "\n", $this->inputIndex,
267 - $endPos - $this->inputIndex );
268 - $this->inputIndex = $endPos + 2;
269 - if ( $numLines ) {
270 - return str_repeat( "\n", $numLines );
271 - } else {
272 - return ' ';
273 - }
274 - default:
275 - return $c;
276 - }
277 - }
278 -
279 - return $c;
280 - }
281 -}
282 -
283 -// -- Exceptions ---------------------------------------------------------------
284 -class JSMinException extends Exception {}
Index: trunk/phase3/includes/libs/JavaScriptDistiller.php
@@ -0,0 +1,75 @@
 2+<?php
 3+/**
 4+ * JavaScript Distiller
 5+ *
 6+ * Author: Dean Edwards, Nicholas Martin, Trevor Parscal
 7+ * License: LGPL
 8+ */
 9+class JavaScriptDistiller {
 10+
 11+ /* Static Methods */
 12+
 13+ /**
 14+ * Removes most of the white-space from JavaScript code.
 15+ *
 16+ * This code came from the first pass of Dean Edwards' JavaScript Packer. Compared to using
 17+ * JSMin::minify, this produces < 1% larger output (after gzip) in approx. 25% of the time.
 18+ *
 19+ * @param $script String: JavaScript code to minify
 20+ */
 21+ public static function stripWhiteSpace( $script, $collapseVertical = false ) {
 22+ // This parser is based on regular expressions, which all get or'd together, so rules take
 23+ // precedence in the order they are added. We can use it to minify by armoring certain
 24+ // regions by matching them and replacing them with the full match, leaving the remaining
 25+ // regions around for further matching and replacing.
 26+ $parser = new ParseMaster();
 27+ // There is a bug in ParseMaster that causes a backslash at the end of a line to be changed
 28+ // to \s if we use a backslash as the escape character. We work around this by using an
 29+ // obscure escape character that we hope will never appear at the end of a line.
 30+ $parser->escapeChar = chr( 1 );
 31+ // Protect strings. The original code had [^\'\\v] here, but that didn't armor multiline
 32+ // strings correctly. This also armors multiline strings that don't have backslashes at the
 33+ // end of the line (these are invalid), but that's fine because we're just armoring here.
 34+ $parser->add('/\'[^\']*\'/', '$1' );
 35+ $parser->add('/"[^"]*"/', '$1' );
 36+ // Remove comments
 37+ $parser->add('/\\/\\/[^\v]*[\v]/', ' ');
 38+ $parser->add('/\\/\\*[^*]*\\*+([^\\/][^*]*\\*+)*\\//', ' ');
 39+ // Protect regular expressions
 40+ $parser->add('/\\h+(\\/[^\\/\\v\\*][^\\/\\v]*\\/g?i?)/', '$2'); // IGNORE
 41+ $parser->add('/[^\\w\\x24\\/\'"*)\\?:]\\/[^\\/\\v\\*][^\\/\\v]*\\/g?i?/', '$1');
 42+ // Remove: ;;; doSomething();
 43+ $parser->add('/;;;[^\\v]+[\\v]/');
 44+ // Remove redundant semi-colons
 45+ $parser->add('/\\(;;\\)/', '$1'); // protect for (;;) loops
 46+ $parser->add('/;+\\h*([};])/', '$2');
 47+ // Apply all rules defined up to this point
 48+ $script = $parser->exec($script);
 49+ // If requested, make some vertical whitespace collapsing as well
 50+ if ( $collapseVertical ) {
 51+ // Collapse whitespaces between and after a ){ pair (function definitions)
 52+ $parser->add('/\\)\\s+\\{\\s+/', '){');
 53+ // Collapse whitespaces between and after a ({ pair (JSON argument)
 54+ $parser->add('/\\(\\s+\\{\\s+/', '({');
 55+ // Collapse whitespaces between a parenthesis and a period (call chaining)
 56+ $parser->add('/\\)\\s+\\./', ').');
 57+ // Collapse vertical whitespaces which come directly after a semicolon or a comma
 58+ $parser->add('/([;,])\\s+/', '$2');
 59+ // Collapse whitespaces between multiple parenthesis/brackets of similar direction
 60+ $parser->add('/([\\)\\}])\\s+([\\)\\}])/', '$2$3');
 61+ $parser->add('/([\\(\\{])\\s+([\\(\\{])/', '$2$3');
 62+ }
 63+ // Collapse horizontal whitespaces between variable names into a single space
 64+ $parser->add('/(\\b|\\x24)\\h+(\\b|\\x24)/', '$2 $3');
 65+ // Collapse horizontal whitespaces between urinary operators into a single space
 66+ $parser->add('/([+\\-])\\h+([+\\-])/', '$2 $3');
 67+ // Collapse all remaining un-protected horizontal whitespace
 68+ $parser->add('/\\h+/', '');
 69+ // Collapse multiple vertical whitespaces with some horizontal spaces between them
 70+ $parser->add('/\\v+\\h*\\v*/', "\n");
 71+
 72+ // Done
 73+ return $parser->exec($script);
 74+
 75+ }
 76+}
Property changes on: trunk/phase3/includes/libs/JavaScriptDistiller.php
___________________________________________________________________
Added: svn:eol-style
177 + native
Index: trunk/phase3/includes/libs/ParseMaster.php
@@ -0,0 +1,214 @@
 2+<?php
 3+/**
 4+ * ParseMaster, version 1.0.2 (2005-08-19) Copyright 2005, Dean Edwards
 5+ * A multi-pattern parser.
 6+ * License: http://creativecommons.org/licenses/LGPL/2.1/
 7+ *
 8+ * This is the PHP version of the ParseMaster component of Dean Edwards' (http://dean.edwards.name/)
 9+ * Packer, which was originally written in JavaScript. It was ported to PHP by Nicolas Martin.
 10+ *
 11+ * Original Source: http://joliclic.free.fr/php/javascript-packer/en/
 12+ *
 13+ * Changes should be pushed back upstream.
 14+ */
 15+class ParseMaster {
 16+ public $ignoreCase = false;
 17+ public $escapeChar = '';
 18+
 19+ // constants
 20+ const EXPRESSION = 0;
 21+ const REPLACEMENT = 1;
 22+ const LENGTH = 2;
 23+
 24+ // used to determine nesting levels
 25+ private $GROUPS = '/\\(/';//g
 26+ private $SUB_REPLACE = '/\\$\\d/';
 27+ private $INDEXED = '/^\\$\\d+$/';
 28+ private $TRIM = '/([\'"])\\1\\.(.*)\\.\\1\\1$/';
 29+ private $ESCAPE = '/\\\./';//g
 30+ private $QUOTE = '/\'/';
 31+ private $DELETED = '/\\x01[^\\x01]*\\x01/';//g
 32+
 33+ public function add($expression, $replacement = '') {
 34+ // count the number of sub-expressions
 35+ // - add one because each pattern is itself a sub-expression
 36+ $length = 1 + preg_match_all($this->GROUPS, $this->_internalEscape((string)$expression), $out);
 37+
 38+ // treat only strings $replacement
 39+ if (is_string($replacement)) {
 40+ // does the pattern deal with sub-expressions?
 41+ if (preg_match($this->SUB_REPLACE, $replacement)) {
 42+ // a simple lookup? (e.g. "$2")
 43+ if (preg_match($this->INDEXED, $replacement)) {
 44+ // store the index (used for fast retrieval of matched strings)
 45+ $replacement = (int)(substr($replacement, 1)) - 1;
 46+ } else { // a complicated lookup (e.g. "Hello $2 $1")
 47+ // build a function to do the lookup
 48+ $quote = preg_match($this->QUOTE, $this->_internalEscape($replacement))
 49+ ? '"' : "'";
 50+ $replacement = array(
 51+ 'fn' => '_backReferences',
 52+ 'data' => array(
 53+ 'replacement' => $replacement,
 54+ 'length' => $length,
 55+ 'quote' => $quote
 56+ )
 57+ );
 58+ }
 59+ }
 60+ }
 61+ // pass the modified arguments
 62+ if (!empty($expression)) $this->_add($expression, $replacement, $length);
 63+ else $this->_add('/^$/', $replacement, $length);
 64+ }
 65+
 66+ public function exec($string) {
 67+ // execute the global replacement
 68+ $this->_escaped = array();
 69+
 70+ // simulate the _patterns.toSTring of Dean
 71+ $regexp = '/';
 72+ foreach ($this->_patterns as $reg) {
 73+ $regexp .= '(' . substr($reg[self::EXPRESSION], 1, -1) . ')|';
 74+ }
 75+ $regexp = substr($regexp, 0, -1) . '/';
 76+ $regexp .= ($this->ignoreCase) ? 'i' : '';
 77+
 78+ $string = $this->_escape($string, $this->escapeChar);
 79+ $string = preg_replace_callback(
 80+ $regexp,
 81+ array(
 82+ &$this,
 83+ '_replacement'
 84+ ),
 85+ $string
 86+ );
 87+ $string = $this->_unescape($string, $this->escapeChar);
 88+
 89+ return preg_replace($this->DELETED, '', $string);
 90+ }
 91+
 92+ public function reset() {
 93+ // clear the patterns collection so that this object may be re-used
 94+ $this->_patterns = array();
 95+ }
 96+
 97+ // private
 98+ private $_escaped = array(); // escaped characters
 99+ private $_patterns = array(); // patterns stored by index
 100+
 101+ // create and add a new pattern to the patterns collection
 102+ private function _add() {
 103+ $arguments = func_get_args();
 104+ $this->_patterns[] = $arguments;
 105+ }
 106+
 107+ // this is the global replace function (it's quite complicated)
 108+ private function _replacement($arguments) {
 109+ if (empty($arguments)) return '';
 110+
 111+ $i = 1; $j = 0;
 112+ // loop through the patterns
 113+ while (isset($this->_patterns[$j])) {
 114+ $pattern = $this->_patterns[$j++];
 115+ // do we have a result?
 116+ if (isset($arguments[$i]) && ($arguments[$i] != '')) {
 117+ $replacement = $pattern[self::REPLACEMENT];
 118+
 119+ if (is_array($replacement) && isset($replacement['fn'])) {
 120+
 121+ if (isset($replacement['data'])) $this->buffer = $replacement['data'];
 122+ return call_user_func(array(&$this, $replacement['fn']), $arguments, $i);
 123+
 124+ } elseif (is_int($replacement)) {
 125+ return $arguments[$replacement + $i];
 126+
 127+ }
 128+ $delete = ($this->escapeChar == '' ||
 129+ strpos($arguments[$i], $this->escapeChar) === false)
 130+ ? '' : "\x01" . $arguments[$i] . "\x01";
 131+ return $delete . $replacement;
 132+
 133+ // skip over references to sub-expressions
 134+ } else {
 135+ $i += $pattern[self::LENGTH];
 136+ }
 137+ }
 138+ }
 139+
 140+ private function _backReferences($match, $offset) {
 141+ $replacement = $this->buffer['replacement'];
 142+ $quote = $this->buffer['quote'];
 143+ $i = $this->buffer['length'];
 144+ while ($i) {
 145+ $replacement = str_replace('$'.$i--, $match[$offset + $i], $replacement);
 146+ }
 147+ return $replacement;
 148+ }
 149+
 150+ private function _replace_name($match, $offset){
 151+ $length = strlen($match[$offset + 2]);
 152+ $start = $length - max($length - strlen($match[$offset + 3]), 0);
 153+ return substr($match[$offset + 1], $start, $length) . $match[$offset + 4];
 154+ }
 155+
 156+ private function _replace_encoded($match, $offset) {
 157+ return $this->buffer[$match[$offset]];
 158+ }
 159+
 160+
 161+ // php : we cannot pass additional data to preg_replace_callback,
 162+ // and we cannot use &$this in create_function, so let's go to lower level
 163+ private $buffer;
 164+
 165+ // encode escaped characters
 166+ private function _escape($string, $escapeChar) {
 167+ if ($escapeChar) {
 168+ $this->buffer = $escapeChar;
 169+ return preg_replace_callback(
 170+ '/\\' . $escapeChar . '(.)' .'/',
 171+ array(&$this, '_escapeBis'),
 172+ $string
 173+ );
 174+
 175+ } else {
 176+ return $string;
 177+ }
 178+ }
 179+ private function _escapeBis($match) {
 180+ $this->_escaped[] = $match[1];
 181+ return $this->buffer;
 182+ }
 183+
 184+ // decode escaped characters
 185+ private function _unescape($string, $escapeChar) {
 186+ if ($escapeChar) {
 187+ $regexp = '/'.'\\'.$escapeChar.'/';
 188+ $this->buffer = array('escapeChar'=> $escapeChar, 'i' => 0);
 189+ return preg_replace_callback
 190+ (
 191+ $regexp,
 192+ array(&$this, '_unescapeBis'),
 193+ $string
 194+ );
 195+
 196+ } else {
 197+ return $string;
 198+ }
 199+ }
 200+ private function _unescapeBis() {
 201+ if (isset($this->_escaped[$this->buffer['i']])
 202+ && $this->_escaped[$this->buffer['i']] != '')
 203+ {
 204+ $temp = $this->_escaped[$this->buffer['i']];
 205+ } else {
 206+ $temp = '';
 207+ }
 208+ $this->buffer['i']++;
 209+ return $this->buffer['escapeChar'] . $temp;
 210+ }
 211+
 212+ private function _internalEscape($string) {
 213+ return preg_replace($this->ESCAPE, '', $string);
 214+ }
 215+}
Property changes on: trunk/phase3/includes/libs/ParseMaster.php
___________________________________________________________________
Added: svn:eol-style
1216 + native
Index: trunk/phase3/includes/DefaultSettings.php
@@ -2448,6 +2448,8 @@
24492449 */
24502450 $wgResourceLoaderUseESI = false;
24512451
 2452+$wgResourceLoaderMinifyJSVerticalSpace = false;
 2453+
24522454 /** @} */ # End of resource loader settings }
24532455
24542456

Follow-up revisions

RevisionCommit summaryAuthorDate
r80666Addresses issues raised in an excellent review of r80656.tparscal00:03, 21 January 2011
r806771.17: MFT r80109, r80113, r80223, r80475, r80554, r80575, r80620, r80621, r80...catrope03:12, 21 January 2011
r80900Resolves issue in r80656 where escaped quotes within strings are not handled ...tparscal19:47, 24 January 2011
r810001.17: MFT r80576, r80583, r80656, r80842, r80900, r80913, r80918, r80919, r80...catrope22:49, 25 January 2011
r81141JSMin is gone, please talk with JavaScriptDistiller...platonides17:51, 28 January 2011

Comments

#Comment by Catrope (talk | contribs)   22:59, 20 January 2011

$wgResourceLoaderMinifyJSVerticalSpace needs a doc comment.

I feel it'd be nicer to put ParseMaster inside the JavaScriptDistiller file, so we don't clutter the libs directory too much and don't contaminate the class namespace and autoloader list as much.

Some rules use \h and \v, which are only available in PHP 5.2.4 and later. Since it's very easy to support earlier PHP versions by just replacing them with their respective character classes, we should do that.

Also, the roles of $1, $2, ... are shifted by one position from their usual roles ($1 is the whole match, which is usually $0; $2 is the first submatch, which is usually $1; etc.), this should be indicated in a comment.

The whitespace transformations after the first call to $parser->exec() seem to apply to previously armored literals as well, which would be bad.

There is special casing for collapsing horizontal whitespace between certain things, followed by unconditional collapsing (which isn't really collapsing, but removal) of all horizontal whitespace; that doesn't make sense.

+		$parser->add('/;+\\h*([};])/', '$2');

This rule removes two kinds of redundant semicolons: semicolons followed by a semicolon (fine) and semicolons followed by a closing brace. I dislike the latter because it relies on semicolon insertion, which IMO is a scary "feature" of JS that should be avoided.

+		$parser->add('/(\\b|\\x24)\\h+(\\b|\\x24)/', '$2 $3');

The use of \\x24 instead of \\$ for the dollar sign is kinda silly, unless the ParseMaster class requires it. It should be changed to use a plain dollar symbol if possible, or documented as \x24 = $ otherwise.

+		// Collapse horizontal whitespaces between urinary operators into a single space

Using 'whitespaces' (there is no plural of 'whitespace' other than 'whitespace characters' maybe), OK, maybe, but for "urinary operators" you just deserve to be ridiculed :P

#Comment by Trevor Parscal (WMF) (talk | contribs)   00:06, 21 January 2011

Best typo ever!

r80666 should fix all that up.

#Comment by Nikerabbit (talk | contribs)   13:40, 23 January 2011

Uncaught SyntaxError: Unexpected token ILLEGAL

	document.write("\x3cscript src=\"//bits.translatewiki.net/sandwiki/load.php?debug=true\x26amp;lang=fi\x26amp;modules=jquery%7Cmediawiki\x26amp;only=scripts\x26amp;skin=vector\x26amp;version=20110120T072820Z\"\x3e\x3c/script\x3e");

Minified:

document.write("\x3cscript src=\"}
#Comment by Catrope (talk | contribs)   17:54, 24 January 2011

The problem seems to be with the \", which JSDistiller seems to think ends the string.

#Comment by Trevor Parscal (WMF) (talk | contribs)   19:52, 24 January 2011

Because we changed the escape character from \ to 0x1 I think this was a side-effect. It's been sorted in r80900.

#Comment by Platonides (talk | contribs)   18:00, 28 January 2011

Why naming the method stripWhiteSpace() ?

Just minify() as the old one would be more clear for reusers (independently of that it internally calls stripWhiteSpace() or a dozen functions).

What if we want to add another step later?

#Comment by Trevor Parscal (WMF) (talk | contribs)   18:50, 28 January 2011

Meh - wasn't too bother with it myself, but feel free to name it what you think is sensible.

#Comment by Tim Starling (talk | contribs)   06:45, 18 February 2011

What test case did you use for your benchmark against JSMin? It only seems about 8% faster to me, with jquery.js.

Status & tagging log