r98698 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r98697‎ | r98698 | r98699 >
Date:17:53, 2 October 2011
Author:aaron
Status:resolved (Comments)
Tags:hashar 
Comment:
FileCache:
* Added FileCacheBase::*MissesRecent() functions for counting cache misses from different visitors.
* Made ObjectFileCache more generic.
* Cleaned up FileCacheBase::checkCacheDirs().
* Added FileCacheBase::typeSubdirectory() function and overwrote in HTMLFileCache. Fixes r98405 invalidating all existing cache due to directory change.
* Simplified FileCacheBase::checkCacheDirs() a bit

ResourceLoader:
* Use ResourceFileCache to handle load() requests, if $wgUseFileCache. Only caches requests for default language and skins. Single modules requests are always cached, whereas others require a certain threshold of traffic.
* Added ResourceFileCache class (functionality was initially to be in ObjectFileCache).
Modified paths:
  • /trunk/phase3/includes/AutoLoader.php (modified) (history)
  • /trunk/phase3/includes/cache/FileCacheBase.php (modified) (history)
  • /trunk/phase3/includes/cache/HTMLFileCache.php (modified) (history)
  • /trunk/phase3/includes/cache/ObjectFileCache.php (modified) (history)
  • /trunk/phase3/includes/cache/ResourceFileCache.php (added) (history)
  • /trunk/phase3/includes/resourceloader/ResourceLoader.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/resourceloader/ResourceLoader.php
@@ -353,8 +353,16 @@
354354 * @param $context ResourceLoaderContext: Context in which a response should be formed
355355 */
356356 public function respond( ResourceLoaderContext $context ) {
357 - global $wgCacheEpoch;
 357+ global $wgCacheEpoch, $wgUseFileCache;
358358
 359+ // Use file cache if enabled and available...
 360+ if ( $wgUseFileCache ) {
 361+ $fileCache = ResourceFileCache::newFromContext( $context );
 362+ if ( $this->tryRespondFromFileCache( $fileCache, $context ) ) {
 363+ return; // output handled
 364+ }
 365+ }
 366+
359367 // Buffer output to catch warnings. Normally we'd use ob_clean() on the
360368 // top-level output buffer to clear warnings, but that breaks when ob_gzhandler
361369 // is used: ob_clean() will clear the GZIP header in that case and it won't come
@@ -432,6 +440,18 @@
433441 ob_end_clean();
434442 echo $response;
435443
 444+ // Save response to file cache unless there are private modules or errors
 445+ if ( isset( $fileCache ) && !$private && !$exceptions && !$missing ) {
 446+ // Cache single modules...and other requests if there are enough hits
 447+ if ( ResourceFileCache::useFileCache( $context ) ) {
 448+ if ( $fileCache->isCacheWorthy() ) {
 449+ $fileCache->saveText( $response );
 450+ } else {
 451+ $fileCache->incrMissesRecent( $context->getRequest() );
 452+ }
 453+ }
 454+ }
 455+
436456 wfProfileOut( __METHOD__ );
437457 }
438458
@@ -520,6 +540,52 @@
521541 }
522542
523543 /**
 544+ * Send out code for a response from file cache if possible
 545+ *
 546+ * @param $fileCache ObjectFileCache: Cache object for this request URL
 547+ * @param $context ResourceLoaderContext: Context in which to generate a response
 548+ * @return bool If this found a cache file and handled the response
 549+ */
 550+ protected function tryRespondFromFileCache(
 551+ ResourceFileCache $fileCache, ResourceLoaderContext $context
 552+ ) {
 553+ global $wgResourceLoaderMaxage;
 554+ // Buffer output to catch warnings.
 555+ ob_start();
 556+ // Get the maximum age the cache can be
 557+ $maxage = is_null( $context->getVersion() )
 558+ ? $wgResourceLoaderMaxage['unversioned']['server']
 559+ : $wgResourceLoaderMaxage['versioned']['server'];
 560+ // Minimum timestamp the cache file must have
 561+ $good = $fileCache->isCacheGood( wfTimestamp( TS_MW, time() - $maxage ) );
 562+ if ( !$good ) {
 563+ try { // RL always hits the DB on file cache miss...
 564+ wfGetDB( DB_SLAVE );
 565+ } catch( DBConnectionError $e ) { // ...check if we need to fallback to cache
 566+ $good = $fileCache->isCacheGood(); // cache existence check
 567+ }
 568+ }
 569+ if ( $good ) {
 570+ $ts = $fileCache->cacheTimestamp();
 571+ // Send content type and cache headers
 572+ $this->sendResponseHeaders( $context, $ts, false );
 573+ // If there's an If-Modified-Since header, respond with a 304 appropriately
 574+ if ( $this->tryRespondLastModified( $context, $ts ) ) {
 575+ return; // output handled (buffers cleared)
 576+ }
 577+ $response = $fileCache->fetchText();
 578+ // Remove the output buffer and output the response
 579+ ob_end_clean();
 580+ echo $response . "\n/* Cached {$ts} */";
 581+ return true; // cache hit
 582+ }
 583+ // Clear buffer
 584+ ob_end_clean();
 585+
 586+ return false; // cache miss
 587+ }
 588+
 589+ /**
524590 * Generates code for a response
525591 *
526592 * @param $context ResourceLoaderContext: Context in which to generate a response
Index: trunk/phase3/includes/AutoLoader.php
@@ -373,6 +373,7 @@
374374 'LinkCache' => 'includes/cache/LinkCache.php',
375375 'MessageCache' => 'includes/cache/MessageCache.php',
376376 'ObjectFileCache' => 'includes/cache/ObjectFileCache.php',
 377+ 'ResourceFileCache' => 'includes/cache/ResourceFileCache.php',
377378 'SquidUpdate' => 'includes/cache/SquidUpdate.php',
378379 'TitleDependency' => 'includes/cache/CacheDependency.php',
379380 'TitleListDependency' => 'includes/cache/CacheDependency.php',
Index: trunk/phase3/includes/cache/ObjectFileCache.php
@@ -4,7 +4,7 @@
55 * @file
66 * @ingroup Cache
77 */
8 -class ObjectFileCache extends FileCacheBase {
 8+abstract class ObjectFileCache extends FileCacheBase {
99 /**
1010 * Construct an ObjectFileCache from a key and a type
1111 * @param $key string
@@ -14,38 +14,18 @@
1515 public static function newFromKey( $key, $type ) {
1616 $cache = new self();
1717
18 - $allowedTypes = self::cacheableTypes();
19 - if ( !isset( $allowedTypes[$type] ) ) {
20 - throw new MWException( "Invalid filecache type given." );
21 - }
2218 $cache->mKey = (string)$key;
2319 $cache->mType = (string)$type;
24 - $cache->mExt = $allowedTypes[$cache->mType];
 20+ $cache->mExt = 'cache';
2521
2622 return $cache;
2723 }
2824
2925 /**
30 - * Get the type => extension mapping
31 - * @return array
32 - */
33 - protected static function cacheableTypes() {
34 - return array( 'resources-js' => 'js', 'resources-css' => 'css' );
35 - }
36 -
37 - /**
3826 * Get the base file cache directory
3927 * @return string
4028 */
4129 protected function cacheDirectory() {
42 - global $wgCacheDirectory, $wgFileCacheDirectory, $wgFileCacheDepth;
43 - if ( $wgFileCacheDirectory ) {
44 - $dir = $wgFileCacheDirectory;
45 - } elseif ( $wgCacheDirectory ) {
46 - $dir = "$wgCacheDirectory/object";
47 - } else {
48 - throw new MWException( 'Please set $wgCacheDirectory in LocalSettings.php if you wish to use the HTML file cache' );
49 - }
50 - return $dir;
 30+ return $this->baseCacheDirectory() . '/object';
5131 }
5232 }
Index: trunk/phase3/includes/cache/ResourceFileCache.php
@@ -0,0 +1,84 @@
 2+<?php
 3+/**
 4+ * Contain the ResourceFileCache class
 5+ * @file
 6+ * @ingroup Cache
 7+ */
 8+class ResourceFileCache extends FileCacheBase {
 9+ protected $mCacheWorthy;
 10+
 11+ /* @TODO: configurable? */
 12+ const MISS_THRESHOLD = 360; // 6/min * 60 min
 13+
 14+ /**
 15+ * Construct an ResourceFileCache from a context
 16+ * @param $context ResourceLoaderContext
 17+ * @return ResourceFileCache
 18+ */
 19+ public static function newFromContext( ResourceLoaderContext $context ) {
 20+ $cache = new self();
 21+
 22+ if ( $context->getOnly() === 'styles' ) {
 23+ $cache->mType = $cache->mExt = 'css';
 24+ } else {
 25+ $cache->mType = $cache->mExt = 'js';
 26+ }
 27+ $modules = array_unique( $context->getModules() ); // remove duplicates
 28+ sort( $modules ); // normalize the order (permutation => combination)
 29+ $cache->mKey = sha1( $context->getHash() . implode( '|', $modules ) );
 30+ if ( count( $modules ) == 1 ) {
 31+ $cache->mCacheWorthy = true; // won't take up much space
 32+ }
 33+
 34+ return $cache;
 35+ }
 36+
 37+ /**
 38+ * Check if an RL request can be cached.
 39+ * Caller is responsible for checking if any modules are private.
 40+ * @param $context ResourceLoaderContext
 41+ * @return bool
 42+ */
 43+ public static function useFileCache( ResourceLoaderContext $context ) {
 44+ global $wgUseFileCache, $wgDefaultSkin, $wgLanguageCode;
 45+ if ( !$wgUseFileCache ) {
 46+ return false;
 47+ }
 48+ // Get all query values
 49+ $queryVals = $context->getRequest()->getValues();
 50+ foreach ( $queryVals as $query => $val ) {
 51+ if ( $query === 'modules' || $query === '*' ) { // &* added as IE fix
 52+ continue;
 53+ } elseif ( $query === 'skin' && $val === $wgDefaultSkin ) {
 54+ continue;
 55+ } elseif ( $query === 'lang' && $val === $wgLanguageCode ) {
 56+ continue;
 57+ } elseif ( $query === 'only' && in_array( $val, array( 'styles', 'scripts' ) ) ) {
 58+ continue;
 59+ } elseif ( $query === 'debug' && $val === 'false' ) {
 60+ continue;
 61+ }
 62+ return false;
 63+ }
 64+ return true; // cacheable
 65+ }
 66+
 67+ /**
 68+ * Get the base file cache directory
 69+ * @return string
 70+ */
 71+ protected function cacheDirectory() {
 72+ return $this->baseCacheDirectory() . '/resources';
 73+ }
 74+
 75+ /**
 76+ * Recent cache misses
 77+ * @return bool
 78+ */
 79+ public function isCacheWorthy() {
 80+ if ( $this->mCacheWorthy === null ) {
 81+ $this->mCacheWorthy = ( $this->getMissesRecent() >= self::MISS_THRESHOLD );
 82+ }
 83+ return $this->mCacheWorthy;
 84+ }
 85+}
Property changes on: trunk/phase3/includes/cache/ResourceFileCache.php
___________________________________________________________________
Added: svn:eol-style
186 + native
Index: trunk/phase3/includes/cache/FileCacheBase.php
@@ -6,19 +6,37 @@
77 */
88 abstract class FileCacheBase {
99 protected $mKey;
10 - protected $mType;
11 - protected $mExt;
 10+ protected $mType = 'object';
 11+ protected $mExt = 'cache';
1212 protected $mFilePath;
1313 protected $mUseGzip;
1414
 15+ /* @TODO: configurable? */
 16+ const MISS_FACTOR = 10; // log 1 every MISS_FACTOR cache misses
 17+
1518 protected function __construct() {
1619 global $wgUseGzip;
1720
1821 $this->mUseGzip = (bool)$wgUseGzip;
19 - $this->mExt = 'cache';
2022 }
2123
2224 /**
 25+ * Get the base file cache directory
 26+ * @return string
 27+ */
 28+ final protected function baseCacheDirectory() {
 29+ global $wgCacheDirectory, $wgFileCacheDirectory, $wgFileCacheDepth;
 30+ if ( $wgFileCacheDirectory ) {
 31+ $dir = $wgFileCacheDirectory;
 32+ } elseif ( $wgCacheDirectory ) {
 33+ $dir = $wgCacheDirectory;
 34+ } else {
 35+ throw new MWException( 'Please set $wgCacheDirectory in LocalSettings.php if you wish to use the HTML file cache' );
 36+ }
 37+ return $dir;
 38+ }
 39+
 40+ /**
2341 * Get the base cache directory (not speficic to this file)
2442 * @return string
2543 */
@@ -34,7 +52,8 @@
3553 }
3654
3755 $dir = $this->cacheDirectory();
38 - $subDirs = $this->mType . '/' . $this->hashSubdirectory(); // includes '/'
 56+ # Build directories (methods include the trailing "/")
 57+ $subDirs = $this->typeSubdirectory() . $this->hashSubdirectory();
3958 # Avoid extension confusion
4059 $key = str_replace( '.', '%2E', urlencode( $this->mKey ) );
4160 # Build the full file path
@@ -112,6 +131,7 @@
113132 */
114133 public function saveText( $text ) {
115134 global $wgUseFileCache;
 135+
116136 if ( !$wgUseFileCache ) {
117137 return false;
118138 }
@@ -121,7 +141,7 @@
122142 }
123143
124144 $this->checkCacheDirs(); // build parent dir
125 - if ( !file_put_contents( $this->cachePath(), $text ) ) {
 145+ if ( !file_put_contents( $this->cachePath(), $text, LOCK_EX ) ) {
126146 return false;
127147 }
128148
@@ -140,21 +160,23 @@
141161
142162 /**
143163 * Create parent directors of $this->cachePath()
144 - * @TODO: why call wfMkdirParents() twice?
145164 * @return void
146165 */
147166 protected function checkCacheDirs() {
148 - $filename = $this->cachePath();
149 - $mydir2 = substr( $filename, 0, strrpos( $filename, '/') ); # subdirectory level 2
150 - $mydir1 = substr( $mydir2, 0, strrpos( $mydir2, '/') ); # subdirectory level 1
 167+ wfMkdirParents( dirname( $this->cachePath() ), null, __METHOD__ );
 168+ }
151169
152 - wfMkdirParents( $mydir1, null, __METHOD__ );
153 - wfMkdirParents( $mydir2, null, __METHOD__ );
 170+ /**
 171+ * Get the cache type subdirectory (with trailing slash) or the empty string
 172+ * @return string
 173+ */
 174+ protected function typeSubdirectory() {
 175+ return $this->mType . '/';
154176 }
155177
156178 /**
157 - * Return relative multi-level hash subdirectory with the trailing
158 - * slash or the empty string if $wgFileCacheDepth is off
 179+ * Return relative multi-level hash subdirectory (with trailing slash)
 180+ * or the empty string if not $wgFileCacheDepth
159181 * @return string
160182 */
161183 protected function hashSubdirectory() {
@@ -170,4 +192,55 @@
171193
172194 return $subdir;
173195 }
 196+
 197+ /**
 198+ * Roughly increments the cache misses in the last hour by unique visitors
 199+ * @param $request WebRequest
 200+ * @return void
 201+ */
 202+ public function incrMissesRecent( WebRequest $request ) {
 203+ global $wgMemc;
 204+ if ( mt_rand( 0, self::MISS_FACTOR - 1 ) == 0 ) {
 205+ # Get an large IP range that should include the user
 206+ # even if that person's IP address changes...
 207+ $ip = $request->getIP();
 208+ if ( !IP::isValid( $ip ) ) {
 209+ return;
 210+ }
 211+ $ip = IP::isIPv6( $ip )
 212+ ? IP::sanitizeRange( "$ip/64" )
 213+ : IP::sanitizeRange( "$ip/16" );
 214+
 215+ # Bail out if a request already came from this range...
 216+ $key = wfMemcKey( get_class( $this ), 'attempt', $this->mType, $this->mKey, $ip );
 217+ if ( $wgMemc->get( $key ) ) {
 218+ return; // possibly the same user
 219+ }
 220+ $wgMemc->set( $key, 1, 3600 );
 221+
 222+ # Increment the number of cache misses...
 223+ $key = $this->cacheMissKey();
 224+ if ( $wgMemc->get( $key ) === false ) {
 225+ $wgMemc->set( $key, 1, 3600 );
 226+ } else {
 227+ $wgMemc->incr( $key );
 228+ }
 229+ }
 230+ }
 231+
 232+ /**
 233+ * Roughly gets the cache misses in the last hour by unique visitors
 234+ * @return int
 235+ */
 236+ public function getMissesRecent() {
 237+ global $wgMemc;
 238+ return self::MISS_FACTOR * $wgMemc->get( $this->cacheMissKey() );
 239+ }
 240+
 241+ /**
 242+ * @return string
 243+ */
 244+ protected function cacheMissKey() {
 245+ return wfMemcKey( get_class( $this ), 'misses', $this->mType, $this->mKey );
 246+ }
174247 }
Index: trunk/phase3/includes/cache/HTMLFileCache.php
@@ -35,6 +35,7 @@
3636
3737 /**
3838 * Get the base file cache directory
 39+ * Note: avoids baseCacheDirectory() for b/c to not skip existing cache
3940 * @return string
4041 */
4142 protected function cacheDirectory() {
@@ -50,6 +51,18 @@
5152 }
5253
5354 /**
 55+ * Get the cache type subdirectory (with the trailing slash) or the empty string
 56+ * @return string
 57+ */
 58+ protected function typeSubdirectory() {
 59+ if ( $this->mType === 'view' ) {
 60+ return ''; // b/c to not skip existing cache
 61+ } else {
 62+ return $this->mType . '/';
 63+ }
 64+ }
 65+
 66+ /**
5467 * Check if pages can be cached for this request/user
5568 * @param $context IContextSource
5669 * @return bool
@@ -71,9 +84,8 @@
7285 // Below are header setting params
7386 } elseif ( $query == 'maxage' || $query == 'smaxage' ) {
7487 continue;
75 - } else {
76 - return false;
7788 }
 89+ return false;
7890 }
7991 $user = $context->getUser();
8092 // Check for non-standard user language; this covers uselang,

Follow-up revisions

RevisionCommit summaryAuthorDate
r98705* Added isCacheWorthy() optimization (checks if the file exists, stale or not)...aaron19:44, 2 October 2011
r98739Follow-up r98698: cache urls with 'version' param...which is kind if importantaaron05:32, 3 October 2011
r101044The early return added in r98698 was skipping the closing wfProfileOut().platonides20:50, 27 October 2011
r106500FU r98698: Show any PHP warnings in tryRespondFromFileCache() in debug modeaaron05:02, 17 December 2011
r107137Follow-up r98698: Reduce amount of ipv6 spam in FileCacheBase::incrMissesRece...aaron07:41, 23 December 2011
r110858add up comment for FileCache rewrite (r98698)hashar17:56, 7 February 2012

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r98405HTMLFileCache refactoring:...aaron08:18, 29 September 2011

Comments

#Comment by Catrope (talk | contribs)   16:02, 3 October 2011
+			// Remove the output buffer and output the response
+			ob_end_clean();

You're not gonna output the warnings and such you gathered in the buffer?

ResourceLoader bits look good to me otherwise.

#Comment by Tim Starling (talk | contribs)   06:55, 23 December 2011
	# Get a large IP range that should include the user  even if that 
	# person's IP address changes
...
	$ip = IP::isIPv6( $ip )
		? IP::sanitizeRange( "$ip/64" )
		: IP::sanitizeRange( "$ip/16" );

You probably want /32 here. /64 is the smallest possible allocation larger than a single address, and so if the ISPs have any sense, every residential customer will get a /64 subnet. Some day every user may have a different /64, and that's a lot of memcached spam with this code. /32 is probably the closest equivalent to IPv4 /16.

Status & tagging log