r59232 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r59231‎ | r59232 | r59233 >
Date:00:37, 19 November 2009
Author:dantman
Status:ok (Comments)
Tags:
Comment:
Add a wgCategories variable to the js variables so users don't need to do really ugly dom searching to see if a category is in the categorylinks box.
Modified paths:
  • /trunk/phase3/includes/OutputPage.php (modified) (history)
  • /trunk/phase3/includes/Skin.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/OutputPage.php
@@ -12,7 +12,7 @@
1313 var $mHTMLtitle = '', $mIsarticle = true, $mPrintable = false;
1414 var $mSubtitle = '', $mRedirect = '', $mStatusCode;
1515 var $mLastModified = '', $mETag = false;
16 - var $mCategoryLinks = array(), $mLanguageLinks = array();
 16+ var $mCategoryLinks = array(), $mCategories = array(), $mLanguageLinks = array();
1717
1818 var $mScriptLoaderClassList = array();
1919
@@ -664,6 +664,10 @@
665665 return $this->mCategoryLinks;
666666 }
667667
 668+ public function getCategories() {
 669+ return $this->mCategories;
 670+ }
 671+
668672 /**
669673 * Add an array of categories, with names in the keys
670674 */
@@ -713,6 +717,7 @@
714718 if ( array_key_exists( $category, $categories ) )
715719 continue;
716720 $text = $wgContLang->convertHtml( $title->getText() );
 721+ $this->mCategories[] = $title->getText();
717722 $this->mCategoryLinks[$type][] = $sk->link( $title, $text );
718723 }
719724 }
Index: trunk/phase3/includes/Skin.php
@@ -416,6 +416,7 @@
417417 'wgFormattedNamespaces' => $wgContLang->getFormattedNamespaces(),
418418 'wgNamespaceIds' => $wgContLang->getNamespaceIds(),
419419 'wgSiteName' => $wgSitename,
 420+ 'wgCategories' => $wgOut->getCategories(),
420421 );
421422 if ( $wgContLang->hasVariants() ) {
422423 $vars['wgUserVariant'] = $wgContLang->getPreferredVariant();

Comments

#Comment by Simetrical (talk | contribs)   00:05, 11 December 2009

Is this really necessary? We already have a huge number of JS variables output. They're taking up more and more space and will almost all be unused on the overwhelming majority of page views. And worst, we can't easily remove them after they're added unless we want to break everyone's scripts. If something can be retrieved from the DOM with five lines of jQuery, I think that's way beyond what needs to be added to the JS variables.

#Comment by Dantman (talk | contribs)   00:22, 11 December 2009

We're already doing all the work needed to generate the variable, what's so bad about that same data in a way a script can use it? The way it would be done (ab)using jQuery is slow, a complete hack, and may break due to skin changes. jQuery was NOT developed for that purpose.

#Comment by Simetrical (talk | contribs)   14:56, 11 December 2009

What's bad is that we currently output no fewer than 43 JS variables on Wikipedia, more seem to be added every month, and we can't easily remove them. They must be loaded on every single page view even though most will be used on almost none. If this doesn't seem like a good reason to you for us to stop adding more and think of a more reasonable way to expose this info to JS, I don't know what further I can say.

#Comment by Dantman (talk | contribs)   15:52, 11 December 2009

Total size of en.wp's homepage:56,127b (54.81kb)
Total size of the script exporting the js variables on en.wp's homepage:1,866b (1.82kb)
Total size of en.wp's homepage gzipped:14,442b (14.10kb)
Total size of en.wp's homepage gzipped without js variables:13,737b (13.42kb)
Total contribution of js variables to en.wp's gzipped homepage: 705b (0.69kb)
Percentages: 3.32% uncompressed, 4.88% gzipped.
Size of variables in memory: Negligible
Cost of making scripts use the api to grab data already present on the page:
- Extra http requests that will slow down display of user script features that only use data already present
- Extra load on the servers to serve out that data (potentially uncachable)
- Desync between the data where the page ends up changed in between the time the user is served the page and the time the data is fetched from the api (iirc, issue elevated by history features in some browsers)
Cost of trying to extract the data from markup in the page:
- Extreme decrease in speed of user scripts
- Even simple user scripts will break when minor things in the skin change
- Complex ugly jQuery becomes necessary for very simple user scripts which need nothing more than a bit of js variable data

Ways to reduce the price of rendering a page:
- Stop generating markup (markup costs more bytes than js variables, and it takes longer for the server to render); Instead output MORE js variables and render those portions of the page with the api.
Cons:
- More features don't work without js.

Moving some js variables that don't change into a js file that can be cached: Minimal bytes saved. The extra http request is quite heavy, slows down the page even more than the trivial few extra variables.

^_^ If you want a way to reduce the work to serve out pages, I suggest creating an alternate browsing method that ONLY outputs JSON page data and makes the 90% of users with js enabled in their browser browser render it completely using js. THAT would save far more bytes over the network, and reduce the number of http requests required for a page. A few less js variables saves nothing, but that would make for a fast browseable Wikipedia for the large majority of people that have JS enabled (non-js users can see the current site).

#Comment by Simetrical (talk | contribs)   16:19, 11 December 2009

So it's "only" 5% of the page gzipped, great. More on smaller pages. That's more than enough that we should look at cutting it down if at all possible. Each bit of bloat may only be 5%, but if you add them all together, you get most of the page being bloat. Picking from w:Special:Random, I got w:Penrhos, Herefordshire:

  • Size of raw HTML: 25856 bytes; 8370 after gzip -1
  • Size of HTML with all <head> bloat stripped: 20274 bytes, 22% savings; 6807 after gzip -1, 19% savings

This isn't even counting HTML attribute bloat. For this purpose, I counted head bloat as including all but two script tags, all but two style tags, one rel="alternate" tag, the title, the doctype, and rel="search". My head was nine lines, and with JS/CSS combining would provide a basically identical user experience to the current head.

We need to cut bloat on a large scale. The first step is to not add more. Of course no one particular thing will make a big difference by itself, that's the definition of bloat: adding lots of little things that collectively make a significant difference. It can make a significant difference to user experience if you can start rendering content in the first TCP window instead of the second, or finish rendering the first screenful in the third window instead of the second. 1500 bytes gzipped will take up most of an initial 500ish-byte TCP window, potentially slowing down progressive rendering by the cost of a round-trip even if all essential includes are cached. Not to mention the effect on bandwidth costs.

#Comment by Dantman (talk | contribs)   19:21, 11 December 2009

We're already hurting progressive rendering by leaving the script tags in the <head> in the first place. To properly support progressive rendering you put the script tags at the very end of the body. And if script tags were there the issue of large script tags being loaded before the data for the content would be moot since all the content would be loading before the script tags even end up at the client.

#Comment by Simetrical (talk | contribs)   23:42, 12 December 2009

I concede the performance issue. You're right that we have much more important front-end performance issues. However, it just doesn't scale to add a JS var to every single page for everything that some script might want to use. We need a better system than just adding a variable for every conceivable datum someone might want. Especially when the variable's contents are actually already in the page, and only take a few lines of JavaScript to retrieve!

#Comment by Bryan (talk | contribs)   10:18, 12 December 2009

We already generate JS from Common.js and Monobook.js. Many javascript vars like wgServer can and shoul be moved there.

#Comment by Simetrical (talk | contribs)   23:25, 12 December 2009

Good point, that would cut it down a bit. And we could put user-specific variables in with the user JS.

#Comment by Tim Starling (talk | contribs)   03:49, 21 December 2009

Which users need to write JavaScript that searches the category box for a given category? If those users wrote their code as an extension instead of as a user JS hack, then the category output could be conditional on user preferences.

#Comment by Dantman (talk | contribs)   08:11, 21 December 2009

I had to write a script once to add two tabs (not important for people without js) when a page was in a specific category once. Jumping into PHP for something that little is overboard. (And it takes 3-5x as much code with all that ugly boilerplate needed)

Status & tagging log