r107556 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r107555‎ | r107556 | r107557 >
Date:09:29, 29 December 2011
Author:santhosh
Status:reverted (Comments)
Tags:
Comment:
Use jqueryMsg wikitext parser to parse interface messages at client side. Support for PLURAL in javascript.
Modified paths:
  • /trunk/phase3/RELEASE-NOTES-1.19 (modified) (history)
  • /trunk/phase3/resources/Resources.php (modified) (history)
  • /trunk/phase3/resources/mediawiki/mediawiki.js (modified) (history)

Diff [purge]

Index: trunk/phase3/RELEASE-NOTES-1.19
@@ -110,6 +110,8 @@
111111 * Added $wgSend404Code, true by default, which can be set to false to send a
112112 200 status code instead of 404 for nonexistent articles.
113113 * (bug 23427) Introduced {{PAGEID}} variable to expose page.page_id
 114+* Use jqueryMsg wikitext parser to parse interface messages at client side.
 115+ Support for PLURAL in javascript.
114116
115117 === Bug fixes in 1.19 ===
116118 * $wgUploadNavigationUrl should be used for file redlinks if.
Index: trunk/phase3/resources/Resources.php
@@ -706,6 +706,7 @@
707707 'jquery.placeholder',
708708 'jquery.mw-jump',
709709 'mediawiki.util',
 710+ 'mediawiki.jqueryMsg'
710711 ),
711712 ),
712713 'mediawiki.page.startup' => array(
Index: trunk/phase3/resources/mediawiki/mediawiki.js
@@ -158,14 +158,9 @@
159159 }
160160 return '<' + this.key + '>';
161161 }
162 - var text = this.map.get( this.key ),
163 - parameters = this.parameters;
164 -
165 - text = text.replace( /\$(\d+)/g, function ( str, match ) {
166 - var index = parseInt( match, 10 ) - 1;
167 - return parameters[index] !== undefined ? parameters[index] : '$' + match;
168 - } );
169 -
 162+
 163+ var text = mw.jqueryMsg.getMessageFunction( )( this.key, this.parameters );
 164+
170165 if ( this.format === 'plain' ) {
171166 return text;
172167 }

Follow-up revisions

RevisionCommit summaryAuthorDate
r107571Revert r107556 based on the discussion.santhosh14:06, 29 December 2011
r108203Use mw.jqueryMsg parser for message parsing to support PLURAL and GENDER...santhosh09:14, 6 January 2012

Comments

#Comment by Mormegil (talk | contribs)   11:51, 29 December 2011

http://integration.mediawiki.org/testswarm/job/362/

mw.jqueryMsg is undefined - { "fileName": "r107556/resources/mediawiki/mediawiki.js https://integration.mediawiki.org/checkouts/mw/trunk/r107556/resources/mediawiki/mediawiki.js", "lineNumber": 162 }

#Comment by Krinkle (talk | contribs)   12:33, 29 December 2011

The dependency is added to the wrong module. mw.msg has got nothing to do with mediawiki.page.ready

The core mediawiki module cannot have any dependencies, so this construction cannot exist, please revert.

Perhaps in the longer term we can find a good way for this, but for now this can't be done, simply use mw.jqueryMsg directly where you need it (add it as a dependency to the module where you use it).

#Comment by Nikerabbit (talk | contribs)   12:41, 29 December 2011

What do you say about moving mw.message into its own module (which can have dependencies) but include it in the default set that is always loaded?

#Comment by Krinkle (talk | contribs)   13:31, 29 December 2011

Well, we can move mw.message as-is (including the Message constructor function and it's prototype) into a separate module, but I'm not sure we should also add the dependency to mw.jqueryMsg to it. Right now it's not used anywhere in core, why load it on all pages by default ?

I don't see the problem with modules that need the parser to add "'dependencies' => 'mw.jqueryMsg'" and use it.

If it would then be used by an extension that is loaded on all pages, that's fine, but I don't think we should literally fix that it should be loaded on all pages. That takes away from the ResourceLoader principle of loading what is needed for the code that is currently on the page.

Also note that moving the base mw.message into a separate module means that all usages everywhere of "mw.msg" need to add "mediawiki.message" do it's dependencies, that's going to cause breakages.

#Comment by Siebrand (talk | contribs)   13:55, 29 December 2011

> Right now it's not used anywhere in core, why load it on all pages by default ?

It will be used more really soon. By all extensions that now have broken plurals in JS -- my guess is that at least 10-15 are affected--, and probably by all JS messages using numbers. Not sure if there are any in core.

#Comment by Santhosh.thottingal (talk | contribs)   14:22, 4 January 2012

Since we discussed about this in the wikitech-l, and decided to go ahead with mw.jqueryMsg dependency till the approach suggested by Roan is ready, could you please suggest a better place to define the dependency other than mediawiki.page.ready? May be with base modules defined in OutputPage.php ?

#Comment by Krinkle (talk | contribs)   12:42, 29 December 2011

Also note that unit tests dependencies are kept up to date manually. So next time keep ./phase3/tests/qunit/index.html in sync to load everything correctly (and while at it, run QUnit tests in your browser before committing something like this).

The manually keeping up to date is a known issue and is fixed in the JSTesting branch which I plan to merge into trunk later this week.

If we do this, please do not change the "plain" format. Right now this revision basically brakes the "plain" method. There is a comment 5 lines down from that diff with a place holder for a parse mode. I'd recommend, if anything, to instead utilize the parse() placeholder and turn it into an asynchronous method (both to allow lazy loading of the module as well as futuristic plans for parsing that interacts with the API):

var parser;

parse: function ( callback ) {
	var msg = this;

	msg.format = 'parse';

	mw.loader.using( 'mw.jqueryMsg', function () {
		parser = parser || mw.jqueryMsg.getMessageFunction( );
		callback( parser( msg.key, msg.parameters ) );
	});
},
#Comment by NeilK (talk | contribs)   22:27, 29 December 2011

I think we need some data to make an appropriate decision here.

jQueryMsg is kind of a monstrous-looking JS package. But we should determine whether it really does bloat the frontend or slow message parsing down, and if so, is it worth it.

Next: if it really is too bloaty, then we should do what I always intended to do, which is to move "parsing" to the server. 90% of the library could be ported to PHP, which would then emit regular messages as regular strings, and messages that had PLURAL and so on in them would be emitted as little JSON structures. Those JSON structures are easy to deal with, and implementing PLURAL/GRAMMAR/etc. would be a very, very tiny library.

#Comment by NeilK (talk | contribs)   22:30, 29 December 2011

Oh, and in between those steps, there are some simple things we can do to speed up jQueryMsg, if the problem is speed (and we decide to not care so much about size).

#Comment by Nikerabbit (talk | contribs)   07:21, 30 December 2011

The impact is about 2 KiB in normal mode and 7 KiB in debug mode. That's much smaller than most of the modules. The basic set we load seems to be over 100 KiB in normal mode when loading Special:BlankPage with many extensions installed.

#Comment by Catrope (talk | contribs)   18:23, 2 January 2012

I like the idea of moving message parsing to the server side that way. I have ideas about how it can be done, and I'll take crack at it today or tomorrow. If what I have in mind turns out to work well, the required support on the JS side will be so small that I'll just merge it into the main mw.msg library, and this whole problem will go away.

#Comment by NeilK (talk | contribs)   20:00, 2 January 2012

So cryptic! :)

Anyway, here's what I had in mind.

The parser part of mediawiki.language.parser.js was generated with PEG, so it should be possible to port to PHP. Fortunately, the original JS generated by that PEG library will be easy to port to PHP.

Unfortunately, I hand-hacked the generated javascript to be significantly slimmer. And then I added maybe a few trivial features.

Suggested procedure:

0) Port the relevant tests in test/jasmine/ to PHP

1) Backport any changes in mw.jqueryMsg.js to mw.jqueryMsg.peg (you might have to look at the earlier incarnation, mediawiki.language.parser.js in UploadWizard/resources.

2) Use http://pegjs.majda.cz/ to generate from that PEG. You will get some very procedural looking JS.

3) Port that to JS, which should be really easy.

4) Check it with the tests

5) Do some magic with resourceloader to allow requesting stuff in a "parsed" format.

And that would solve the frontend bloat problem, although this is as far as I can tell at least a week's work to get really right, and really only save like 6K. (You'd still need 1K to use it). But it may be worth it.

Also, at the same time, I have been doing some speed tests... even without parsing there is a speed penalty for using jQueryMsg (it's 1/4-1/5 the speed). This may not matter on the average page where you are showing just a few hundred messages max. I have an uncommitted change which will speed things up considerably though, still working on it.

#Comment by NeilK (talk | contribs)   20:01, 2 January 2012

Sorry, that's 6K saved in *debug* mode. In optimized mode you're saving about 1.5K.

#Comment by NeilK (talk | contribs)   20:10, 2 January 2012

Oh, and here's the jsperf where I've been working on a hybrid strategy -- use regexes sometimes and use jQuery other times.

http://jsperf.com/jquerymsg-vs-msg-2

The thing is, this ruins much of the appeal of jQueryMsg, which is that you never need to escape things. The library already knows what kind of content you are targeting so it can use the appropriate escape. If we add a ".msgAttr" feature, then we can dispense with even more cruft when adding messages to attributes.

So possibly the entire jsperf effort was a bad idea, and all regex string hackery should die. This is why I haven't committed the changes yet.

#Comment by NeilK (talk | contribs)   20:14, 2 January 2012

Oh and one more thing -- with regexes, It's way harder to do backslash-escaping of characters like { and $ as you will see in some of the tests in the source of that jsPerf. You need several passes. At least jQueryMsg deals with that in one pass.

#Comment by Catrope (talk | contribs)   20:16, 2 January 2012

Sorry for being so cryptic :)

I was gonna look at the feasibility of using the preprocessor DOM output to generate the "little JSON structures" you speak of, and then just integrate the whole thing with the existing message delivery parts of RL.

#Comment by Catrope (talk | contribs)   21:06, 2 January 2012

Here's a hacky proof of concept in eval.php, I'll work on this some more in the morning (and hopefully I'll catch Tim then).

catrope@roanLaptop:~/mediawiki/trunk/phase3$ php maintenance/eval.php
# Use $wgParser for something so it gets unstubbed. This is a hack and probably not needed in a practical implementation
> $output = $wgParser->parse( "There {{PLURAL:$1|is one article|are $1 articles}} on this wiki.", Title::newMainPage(), new ParserOptions );

# Clone $wgParser
> $parser = clone $wgParser;

# Our own function to handle PLURAL, outputs a JSON representation of the call
> function myPlural( $parser, $text = '' ) { $forms = array_slice( func_get_args(), 2 ); return '", {"func": "plural", "text": ' . json_encode($text) . ', "forms": ' . json_encode($forms) . '}, "'; }

# Install the function as the hook for PLURAL, overriding the default hook
> $parser->setFunctionHook( 'plural', 'myPlural', SFH_NO_HASH );

# Run the text that we want to parse through the preprocessor (parse() will work too, if we want HTML)
> $output2 = $parser->preprocess( "There {{PLURAL:$1|is one article|are $1 articles}} on this wiki.", Title::newMainPage(), new ParserOptions );

# Wrap the output to make it proper JSON
> $json = '["' . $output2 . '"]';

# The JSON output
> echo $json
["There ", {"func": "plural", "text": "$1", "forms": ["is one article","are $1 articles"]}, " on this wiki."]

# PHP array representation of the JSON output
> var_dump(json_decode($json, true))
array(3) {
  [0]=>
  string(6) "There "
  [1]=>
  array(3) {
    ["func"]=>
    string(6) "plural"
    ["text"]=>
    string(2) "$1"
    ["forms"]=>
    array(2) {
      [0]=>
      string(14) "is one article"
      [1]=>
      string(15) "are $1 articles"
    }
  }
  [2]=>
  string(14) " on this wiki."
}
#Comment by NeilK (talk | contribs)   21:23, 2 January 2012

Ah, I see. Well I did not see any useful way to instrument the parser when I had first looked into this or I would have done it your way. That looks very promising.

#Comment by NeilK (talk | contribs)   21:33, 2 January 2012

There is one small issue though (which is still a bug in jQueryMsg itself) is that you have to parse HTML simultaneously. Because each "atom" of text is treated as a whole jQuery object (and "This " results in "This ", PLURAL, "").

That could be fixed by adding primitive HTML parsing to the PEG thing -- I'm not sure how you fix it with the MediaWiki parser.

Still though, if that's the only bug we could live with it.

Status & tagging log