r69746 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r69745‎ | r69746 | r69747 >
Date:19:43, 22 July 2010
Author:peter17
Status:ok (Comments)
Tags:
Comment:
Batch caching for API-retrieved templates, as requested in r69730
Modified paths:
  • /branches/iwtransclusion/phase3/includes/Interwiki.php (modified) (history)

Diff [purge]

Index: branches/iwtransclusion/phase3/includes/Interwiki.php
@@ -278,41 +278,45 @@
279279 } else if( $transAPI !== '' ) {
280280
281281 $fullTitle = $title->getNsText().':'.$title->getText();
282 -
283 - $url1 = $transAPI."?action=query&prop=revisions&titles=$fullTitle&rvprop=content&format=json";
284 -
285 - if ( strlen( $url1 ) > 255 ) {
286 - return false;
287 - }
288282
289 - $finalText = self::fetchTemplateHTTPMaybeFromCache( $url1 );
 283+ $finalText = self::fetchTemplateFromAPI( $wikiID, $transAPI, $fullTitle );
290284
291 - $url2 = $transAPI."?action=parse&text={{".$fullTitle."}}&prop=templates&format=json";
 285+ //Retrieve the list of subtemplates
 286+ $url2 = wfAppendQuery( $transAPI,
 287+ array( 'action' => 'query',
 288+ 'titles' => $fullTitle,
 289+ 'prop' => 'templates',
 290+ 'format' => 'json'
 291+ )
 292+ );
292293
293294 $get = Http::get( $url2 );
294295 $myArray = FormatJson::decode($get, true);
295296
296 - if ( ! empty( $myArray['parse'] )) {
297 - $templates = $myArray['parse']['templates'];
 297+ if ( ! empty( $myArray['query'] )) {
 298+ if ( ! empty( $myArray['query']['pages'] )) {
 299+ $templates = array_pop( $myArray['query']['pages'] );
 300+ if ( ! empty( $templates['templates'] )) {
 301+ $templates = $templates['templates'];
 302+ } else {
 303+ $templates = array( );
 304+ }
 305+ }
298306 }
299307
300 - // TODO: The templates are retrieved one by one. We should get them all in 1 request
 308+ // TODO: The subtemplates are retrieved one by one. We should get them all in 1 request
301309 // Here, we preload and cache the subtemplates
302 - for ($i = 0 ; $i < count( $templates ) ; $i++) {
303 - $newTitle = $templates[$i]['*'];
304 -
305 - $url = $transAPI."?action=query&prop=revisions&titles=$newTitle&rvprop=content&format=json";
306 -
307 - // $newText is unused, but requesting it will put the template in the cache
308 - $newText = self::fetchTemplateHTTPMaybeFromCache( $url );
309 -
310 - }
 310+ self::cacheTemplatesFromAPI( $wikiID, $transAPI, $templates );
 311+
311312 return $finalText;
312313
313314 }
314315 return false;
315316 }
316317
 318+ /**
 319+ * Retrieve the wikitext of a distant page accessing the foreign DB
 320+ */
317321 public static function fetchTemplateFromDB ( $wikiID, $title ) {
318322
319323 $revision = Revision::loadFromTitleForeignWiki( $wikiID, $title );
@@ -325,43 +329,77 @@
326330 return false;
327331 }
328332
329 - public static function fetchTemplateHTTPMaybeFromCache( $url ) {
330 - global $wgTranscludeCacheExpiry;
331 - $dbr = wfGetDB( DB_SLAVE );
332 - $tsCond = $dbr->timestamp( time() - $wgTranscludeCacheExpiry );
333 - $obj = $dbr->selectRow( 'transcache', array('tc_time', 'tc_contents' ),
334 - array( 'tc_url' => $url, "tc_time >= " . $dbr->addQuotes( $tsCond ) ) );
335 -
336 - if ( $obj ) {
337 - return $obj->tc_contents;
 333+ /**
 334+ * Retrieve the wikitext of a distant page using the API of the foreign wiki
 335+ */
 336+ public static function fetchTemplateFromAPI( $wikiID, $transAPI, $fullTitle ) {
 337+ global $wgMemc, $wgTranscludeCacheExpiry;
 338+
 339+ $key = wfMemcKey( 'iwtransclustiontext', 'textid', $wikiID, $fullTitle );
 340+ $text = $wgMemc->get( $key );
 341+ if( $text ){
 342+ return $text;
338343 }
339 -
 344+
 345+ $url = wfAppendQuery(
 346+ $transAPI,
 347+ array( 'action' => 'query',
 348+ 'titles' => $fullTitle,
 349+ 'prop' => 'revisions',
 350+ 'rvprop' => 'content',
 351+ 'format' => 'json'
 352+ )
 353+ );
 354+
340355 $get = Http::get( $url );
341 -
342356 $content = FormatJson::decode( $get, true );
343357
344358 if ( ! empty($content['query']['pages']) ) {
345359
346360 $page = array_pop( $content['query']['pages'] );
347361 $text = $page['revisions'][0]['*'];
348 -
349 - } else {
350 -
351 - return wfMsg( 'scarytranscludefailed', $url );
352 -
 362+ $wgMemc->set( $key, $text, $wgTranscludeCacheExpiry );
 363+ return $text;
353364 }
354 -
355 - $dbw = wfGetDB( DB_MASTER );
356 - $dbw->replace( 'transcache', array('tc_url'), array(
357 - 'tc_url' => $url,
358 - 'tc_time' => $dbw->timestamp( ),
359 - 'tc_contents' => $text)
360 - );
361 -
362 - return $text;
 365+ return false;
363366 }
364367
 368+ public static function cacheTemplatesFromAPI( $wikiID, $transAPI, $titles ){
 369+ global $wgMemc, $wgTranscludeCacheExpiry;
 370+
 371+ $outdatedTitles = array( );
 372+
 373+ foreach( $titles as $title ){
 374+ $key = wfMemcKey( 'iwtransclustiontext', 'textid', $wikiID, $title['title'] );
 375+ $text = $wgMemc->get( $key );
 376+ if( !$text ){
 377+ $outdatedTitles[] = $title['title'];
 378+ }
 379+ }
 380+
 381+ $batches = array_chunk( $outdatedTitles, 50 );
 382+
 383+ foreach( $batches as $batch ){
 384+ $url = wfAppendQuery(
 385+ $transAPI,
 386+ array( 'action' => 'query',
 387+ 'titles' => implode( '|', $batch ),
 388+ 'prop' => 'revisions',
 389+ 'rvprop' => 'content',
 390+ 'format' => 'json'
 391+ )
 392+ );
 393+ $get = Http::get( $url );
 394+ $content = FormatJson::decode( $get, true );
 395+
 396+ if ( ! empty($content['query']['pages']) ) {
 397+ foreach( $content['query']['pages'] as $page ) {
 398+ $key = wfMemcKey( 'iwtransclustiontext', 'textid', $wikiID, $page['title'] );
 399+ $text = $page['revisions'][0]['*'];
 400+ $wgMemc->set( $key, $text, $wgTranscludeCacheExpiry );
 401+ }
 402+ }
 403+ }
 404+ }
365405
366 -
367 -
368406 }

Follow-up revisions

RevisionCommit summaryAuthorDate
r69781Negative caching of API-retrieved templates (as requested in r69746) + prefix...peter1709:11, 23 July 2010
r87106Merge r69745, r69746, r69781, r69783reedy00:28, 29 April 2011
r92987Merge r87106 which is a Merge r69745, r69746, r69781, r69783reedy17:25, 24 July 2011

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r69730iwtransclusion code updated according to the remarks about r69723, except abo...peter1714:14, 22 July 2010

Comments

#Comment by Catrope (talk | contribs)   20:31, 22 July 2010

Looks good generally. A few minor points:

  • This could use some isset() checks to prevent notices about array indices not being set when you access them.
  • You're not doing negative caching, i.e. you're not caching the fact that a template is missing. This means missing templates will be rerequested every time. I'm not sure what you'd use to indicate missing status, since false and null are both used to indicate a cache miss (by different backends; yes, this sucks, but that's the way it is currently) and the empty string (or any string, really) might very well just be the template contents. The only other alternatives I can think of are things like true, an array or an object, but those aren't very nice. I'll discuss this and look for examples elsewhere in MW. Maybe the nicest thing to go with for now is something like array( 'missing' => true )

Status & tagging log