r86179 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r86178‎ | r86179 | r86180 >
Date:03:39, 16 April 2011
Author:aaron
Status:ok (Comments)
Tags:schema 
Comment:
Overdue changes to `flaggedrevs` table:
* Removed unused ft_text and fr_comment fields
* Changed the PK to just rev_id (with an index on old PK fields)
* Added fr_rev_timestamp field with indexes
* Added a script to populate fr_rev_timestamp
Modified paths:
  • /trunk/extensions/FlaggedRevs/dataclasses/FlaggedRevision.php (modified) (history)
  • /trunk/extensions/FlaggedRevs/maintenance/populateRevTimestamp.inc (added) (history)
  • /trunk/extensions/FlaggedRevs/maintenance/populateRevTimestamp.php (added) (history)
  • /trunk/extensions/FlaggedRevs/schema/FlaggedRevsUpdater.hooks.php (modified) (history)
  • /trunk/extensions/FlaggedRevs/schema/mysql/FlaggedRevs.sql (modified) (history)
  • /trunk/extensions/FlaggedRevs/schema/mysql/patch-fr_page_rev-index.sql (added) (history)
  • /trunk/extensions/FlaggedRevs/schema/postgres/FlaggedRevs.pg.sql (modified) (history)
  • /trunk/extensions/FlaggedRevs/schema/postgres/patch-fr_page_rev-index.sql (added) (history)

Diff [purge]

Index: trunk/extensions/FlaggedRevs/schema/mysql/patch-fr_page_rev-index.sql
@@ -0,0 +1,13 @@
 2+ALTER TABLE /*_*/flaggedrevs
 3+ ADD COLUMN fr_rev_timestamp varbinary(14) NOT NULL default '',
 4+ DROP COLUMN fr_text,
 5+ DROP COLUMN fr_comment,
 6+-- Old (fr_page_id,fr_rev_id) key
 7+ DROP PRIMARY KEY;
 8+
 9+-- Take the first row of any duplicates on new key
 10+ALTER IGNORE TABLE /*_*/flaggedrevs ADD PRIMARY KEY (fr_rev_id);
 11+
 12+CREATE INDEX /*i*/page_rev ON /*_*/flaggedrevs (fr_page_id,fr_rev_id);
 13+CREATE INDEX /*i*/page_time ON /*_*/flaggedrevs (fr_page_id,fr_rev_timestamp);
 14+CREATE INDEX /*i*/page_qal_time ON /*_*/flaggedrevs (fr_page_id,fr_quality,fr_rev_timestamp);
Index: trunk/extensions/FlaggedRevs/schema/mysql/FlaggedRevs.sql
@@ -1,5 +1,4 @@
 2+-- (c) Aaron Schulz, 2007-2011, GPL
23 -- Replace /*_*/ with the proper prefix
34 -- Replace /*$wgDBTableOptions*/ with the correct options
45
@@ -36,47 +35,46 @@
3736 PRIMARY KEY (fpp_page_id,fpp_quality)
3837 ) /*$wgDBTableOptions*/;
3938
40 -CREATE INDEX /*i*/fpp_quality_pending ON /*_*/flaggedpage_pending (fpp_quality,fpp_pending_since);
 39+CREATE INDEX /*i*/fpp_quality_pending
 40+ ON /*_*/flaggedpage_pending (fpp_quality,fpp_pending_since);
4141
4242 -- This stores all of our revision reviews; it is the main table
4343 -- The template/file version data is stored in the next two tables
4444 CREATE TABLE IF NOT EXISTS /*_*/flaggedrevs (
 45+ -- Foreign key to revision.rev_id
 46+ fr_rev_id integer unsigned NOT NULL PRIMARY KEY,
 47+ -- Timestamp of revision reviewed (revision.rev_timestamp)
 48+ fr_rev_timestamp varbinary(14) NOT NULL default '',
4549 -- Foreign key to page.page_id
4650 fr_page_id integer unsigned NOT NULL,
47 - -- Foreign key to revision.rev_id
48 - fr_rev_id integer unsigned NOT NULL,
4951 -- Foreign key to user.user_id
5052 fr_user integer unsigned NOT NULL,
5153 -- Timestamp of review
5254 fr_timestamp varbinary(14) NOT NULL,
53 - -- Review notes
54 - fr_comment mediumblob NOT NULL,
55 - -- The quality tier (0=checked, 1=quality, 2=pristine)
 55+ -- Store the precedence level
5656 fr_quality tinyint(1) NOT NULL default 0,
5757 -- Store tag metadata as newline separated,
5858 -- colon separated tag:value pairs
5959 fr_tags mediumblob NOT NULL,
60 - -- Store the text with all transclusions resolved
61 - -- This will trade space for speed
62 - fr_text mediumblob NOT NULL,
6360 -- Comma-separated list of flags:
6461 -- dynamic: no text, templates must be fetched
65 - -- auto: revision patrolled automatically
 62+ -- auto: revision reviewed automatically
6663 -- utf8: in UTF-8
6764 fr_flags tinyblob NOT NULL,
68 - -- Parameters for revisions of Image pages:
 65+ -- Parameters for revisions of File pages:
6966 -- Name of included image (NULL if n/a)
7067 fr_img_name varchar(255) binary NULL default NULL,
7168 -- Timestamp of file (when uploaded) (NULL if n/a)
7269 fr_img_timestamp varbinary(14) NULL default NULL,
7370 -- Statistically unique SHA-1 key (NULL if n/a)
74 - fr_img_sha1 varbinary(32) NULL default NULL,
75 -
76 - PRIMARY KEY (fr_page_id,fr_rev_id)
 71+ fr_img_sha1 varbinary(32) NULL default NULL
7772 ) /*$wgDBTableOptions*/;
7873
 74+CREATE INDEX /*i*/page_rev ON /*_*/flaggedrevs (fr_page_id,fr_rev_id);
 75+CREATE INDEX /*i*/page_time ON /*_*/flaggedrevs (fr_page_id,fr_rev_timestamp);
 76+CREATE INDEX /*i*/page_qal_rev ON /*_*/flaggedrevs (fr_page_id,fr_quality,fr_rev_id);
 77+CREATE INDEX /*i*/page_qal_time ON /*_*/flaggedrevs (fr_page_id,fr_quality,fr_rev_timestamp);
7978 CREATE INDEX /*i*/fr_img_sha1 ON /*_*/flaggedrevs (fr_img_sha1);
80 -CREATE INDEX /*i*/page_qal_rev ON /*_*/flaggedrevs (fr_page_id,fr_quality,fr_rev_id);
8179
8280 -- This stores all of our transclusion revision pointers
8381 CREATE TABLE IF NOT EXISTS /*_*/flaggedtemplates (
@@ -129,8 +127,10 @@
130128 ftr_title varchar(255) binary NOT NULL default ''
131129 ) /*$wgDBTableOptions*/;
132130
133 -CREATE UNIQUE INDEX /*i*/from_namespace_title ON /*_*/flaggedrevs_tracking (ftr_from,ftr_namespace,ftr_title);
134 -CREATE INDEX /*i*/namespace_title_from ON /*_*/flaggedrevs_tracking (ftr_namespace,ftr_title,ftr_from);
 131+CREATE UNIQUE INDEX /*i*/from_namespace_title
 132+ ON /*_*/flaggedrevs_tracking (ftr_from,ftr_namespace,ftr_title);
 133+CREATE INDEX /*i*/namespace_title_from
 134+ ON /*_*/flaggedrevs_tracking (ftr_namespace,ftr_title,ftr_from);
135135
136136 -- This stores user demotions and stats
137137 CREATE TABLE IF NOT EXISTS /*_*/flaggedrevs_promote (
Index: trunk/extensions/FlaggedRevs/schema/FlaggedRevsUpdater.hooks.php
@@ -33,6 +33,8 @@
3434 'flaggedrevs_stats', "$base/patch-flaggedrevs_stats.sql", true ) );
3535 $du->addExtensionUpdate( array( 'FlaggedRevsUpdaterHooks::doFlaggedImagesTimestampNULL',
3636 "$base/patch-fi_img_timestamp.sql" ) );
 37+ $du->addExtensionUpdate( array( 'addIndex',
 38+ 'flaggedrevs', 'page_rev', "$base/patch-fr_page_rev-index.sql", true ) );
3739 } elseif ( $wgDBtype == 'postgres' ) {
3840 $base = dirname( __FILE__ ) . '/postgres';
3941 // Initial install tables (current schema)
@@ -60,6 +62,8 @@
6163 // @TODO: PG stats table???
6264 $du->addExtensionUpdate( array( 'FlaggedRevsUpdaterHooks::doFlaggedImagesTimestampNULL',
6365 "$base/patch-fi_img_timestamp.sql" ) );
 66+ $du->addExtensionUpdate( array( 'addIndex',
 67+ 'flaggedrevs', 'page_rev', "$base/patch-fr_page_rev-index.sql", true ) );
6468 } elseif ( $wgDBtype == 'sqlite' ) {
6569 $base = dirname( __FILE__ ) . '/mysql';
6670 $du->addExtensionUpdate( array( 'addTable',
Index: trunk/extensions/FlaggedRevs/schema/postgres/patch-fr_page_rev-index.sql
@@ -0,0 +1,11 @@
 2+-- Add file metadata for flaggedrevs of image pages
 3+ALTER TABLE flaggedrevs
 4+ ADD COLUMN fr_rev_timestamp TIMESTAMPTZ NOT NULL,
 5+ DROP COLUMN fr_text;
 6+ DROP COLUMN fr_comment;
 7+ DROP PRIMARY KEY;
 8+ ADD PRIMARY KEY (fr_rev_id);
 9+
 10+CREATE INDEX page_rev ON flaggedrevs (fr_page_id,fr_rev_id);
 11+CREATE INDEX page_time ON flaggedrevs (fr_page_id,fr_rev_timestamp);
 12+CREATE INDEX page_qal_time ON flaggedrevs (fr_page_id,fr_quality,fr_rev_timestamp);
Index: trunk/extensions/FlaggedRevs/schema/postgres/FlaggedRevs.pg.sql
@@ -25,21 +25,23 @@
2626 CREATE INDEX fpp_quality_pending ON flaggedpage_pending (fpp_quality,fpp_pending_since);
2727
2828 CREATE TABLE flaggedrevs (
 29+ fr_rev_id BIGINT NOT NULL DEFAULT 0,
 30+ fr_rev_timestamp TIMESTAMPTZ NOT NULL,
2931 fr_page_id BIGINT NOT NULL DEFAULT 0,
30 - fr_rev_id BIGINT NOT NULL DEFAULT 0,
3132 fr_user BIGINT NULL REFERENCES mwuser(user_id) ON DELETE SET NULL,
3233 fr_timestamp TIMESTAMPTZ,
33 - fr_comment TEXT NOT NULL DEFAULT '',
3434 fr_quality INTEGER NOT NULL DEFAULT 0,
3535 fr_tags TEXT NOT NULL DEFAULT '',
36 - fr_text TEXT NOT NULL DEFAULT '',
3736 fr_flags TEXT NOT NULL,
3837 fr_img_name TEXT NULL DEFAULT NULL,
3938 fr_img_timestamp TIMESTAMPTZ NULL DEFAULT NULL,
4039 fr_img_sha1 TEXT NULL DEFAULT NULL,
41 - PRIMARY KEY (fr_page_id,fr_rev_id)
 40+ PRIMARY KEY (fr_rev_id)
4241 );
 42+CREATE INDEX page_rev ON flaggedrevs (fr_page_id,fr_rev_id);
 43+CREATE INDEX page_time ON flaggedrevs (fr_page_id,fr_rev_timestamp);
4344 CREATE INDEX page_qal_rev ON flaggedrevs (fr_page_id,fr_quality,fr_rev_id);
 45+CREATE INDEX page_qal_time ON flaggedrevs (fr_page_id,fr_quality,fr_rev_timestamp);
4446 CREATE INDEX fr_img_sha1 ON flaggedrevs (fr_img_sha1);
4547
4648 CREATE TABLE flaggedtemplates (
Index: trunk/extensions/FlaggedRevs/maintenance/populateRevTimestamp.php
@@ -0,0 +1,33 @@
 2+<?php
 3+
 4+if ( getenv( 'MW_INSTALL_PATH' ) ) {
 5+ $IP = getenv( 'MW_INSTALL_PATH' );
 6+} else {
 7+ $IP = dirname(__FILE__).'/../../..';
 8+}
 9+
 10+$options = array( 'help', 'startrev' );
 11+require "$IP/maintenance/commandLine.inc";
 12+require dirname(__FILE__) . '/populateRevTimestamp.inc';
 13+
 14+if ( isset($options['help']) ) {
 15+ echo <<<TEXT
 16+Purpose:
 17+ Populates fr_rev_timestamp column in the flaggedrevs table.
 18+Usage:
 19+ php populateRevTimestamp.php --help
 20+ php populateRevTimestamp.php [--startrev <ID>]
 21+
 22+ --help : This help message
 23+ --<ID> : The ID of the starting rev
 24+
 25+TEXT;
 26+ exit(0);
 27+}
 28+
 29+error_reporting( E_ALL );
 30+
 31+$startRev = isset( $options['startrev'] ) ?
 32+ (int)$options['startrev'] : null;
 33+
 34+populate_fr_rev_timestamp( $startRev );
Property changes on: trunk/extensions/FlaggedRevs/maintenance/populateRevTimestamp.php
___________________________________________________________________
Added: svn:eol-style
135 + native
Index: trunk/extensions/FlaggedRevs/maintenance/populateRevTimestamp.inc
@@ -0,0 +1,63 @@
 2+<?php
 3+
 4+function populate_fr_rev_timestamp( $start = null ) {
 5+ echo "Populating and correcting flaggedrevs columns\n";
 6+
 7+ $BATCH_SIZE = 1000;
 8+
 9+ $db = wfGetDB( DB_MASTER );
 10+
 11+ if ( $start === null ) {
 12+ $start = $db->selectField( 'flaggedrevs', 'MIN(fr_rev_id)', false, __FUNCTION__ );
 13+ }
 14+ $end = $db->selectField( 'flaggedrevs', 'MAX(fr_rev_id)', false, __FUNCTION__ );
 15+ if ( is_null( $start ) || is_null( $end ) ) {
 16+ echo "...flaggedrevs table seems to be empty.\n";
 17+ return;
 18+ }
 19+ # Do remaining chunk
 20+ $end += $BATCH_SIZE - 1;
 21+ $blockStart = $start;
 22+ $blockEnd = $start + $BATCH_SIZE - 1;
 23+ $count = 0;
 24+ $changed = 0;
 25+ while ( $blockEnd <= $end ) {
 26+ echo "...doing fr_rev_id from $blockStart to $blockEnd\n";
 27+ $cond = "fr_rev_id BETWEEN $blockStart AND $blockEnd AND fr_rev_timestamp = ''";
 28+ $res = $db->select(
 29+ array( 'flaggedrevs', 'revision', 'archive' ),
 30+ array( 'fr_rev_id', 'rev_timestamp', 'ar_timestamp' ),
 31+ $cond,
 32+ __FUNCTION__,
 33+ array(),
 34+ array( 'revision' => array( 'LEFT JOIN', 'rev_id = fr_rev_id' ),
 35+ 'archive' => array( 'LEFT JOIN', 'ar_rev_id = fr_rev_id' ) ) // non-unique but OK
 36+ );
 37+ $db->begin();
 38+ # Go through and clean up missing items, as well as correct fr_quality...
 39+ foreach ( $res as $row ) {
 40+ $timestamp = '';
 41+ if ( $row->rev_timestamp ) {
 42+ $timestamp = $row->rev_timestamp;
 43+ } elseif ( $row->ar_timestamp ) {
 44+ $timestamp = $row->ar_timestamp;
 45+ }
 46+ if ( $timestamp != '' ) {
 47+ # Update the row...
 48+ $db->update( 'flaggedrevs',
 49+ array( 'fr_rev_timestamp' => $timestamp ),
 50+ array( 'fr_rev_id' => $row->fr_rev_id ),
 51+ __FUNCTION__
 52+ );
 53+ $changed++;
 54+ }
 55+ $count++;
 56+ }
 57+ $db->commit();
 58+ $db->freeResult( $res );
 59+ $blockStart += $BATCH_SIZE;
 60+ $blockEnd += $BATCH_SIZE;
 61+ wfWaitForSlaves( 5 );
 62+ }
 63+ echo "fr_rev_timestamp columns update complete ... {$count} rows [{$changed} changed]\n";
 64+}
Property changes on: trunk/extensions/FlaggedRevs/maintenance/populateRevTimestamp.inc
___________________________________________________________________
Added: svn:eol-style
165 + native
Index: trunk/extensions/FlaggedRevs/dataclasses/FlaggedRevision.php
@@ -296,12 +296,11 @@
297297 $revRow = array(
298298 'fr_page_id' => $this->getPage(),
299299 'fr_rev_id' => $this->getRevId(),
 300+ 'fr_rev_timestamp' => $this->getRevTimestamp(),
300301 'fr_user' => $this->getUser(),
301302 'fr_timestamp' => $dbw->timestamp( $this->getTimestamp() ),
302 - 'fr_comment' => '', # not used anymore
303303 'fr_quality' => $this->getQuality(),
304304 'fr_tags' => self::flattenRevisionTags( $this->getTags() ),
305 - 'fr_text' => '', # not used anymore
306305 'fr_flags' => implode( ',', $this->mFlags ),
307306 'fr_img_name' => $this->getFileName(),
308307 'fr_img_timestamp' => $dbw->timestampOrNull( $this->getFileTimestamp() ),

Follow-up revisions

RevisionCommit summaryAuthorDate
r87606Follow-up r86179: Updated and commented out two USE INDEX statements. This *h...aaron23:44, 6 May 2011
r87616* Follow-up r86179: code changes to make use of index changes. Uses timestamp...aaron00:45, 7 May 2011
r87663Follow-up r86179: use new PK in delete() conditionaaron21:34, 7 May 2011
r87669Run populate_fr_rev_timestamp() via updateraaron00:58, 8 May 2011
r87690Added missing $dbw->timestamp() format call for new fr_rev_timestamp fieldaaron18:49, 8 May 2011
r93036Added simple position tracking to populateRevTimestamp.php scriptaaron02:58, 25 July 2011
r98569Fixed totally broken query plan for anon edit stats....aaron21:31, 30 September 2011

Comments

#Comment by Bryan (talk | contribs)   08:09, 18 April 2011

svn:eol-style on the .sql files.

#Comment by Aaron Schulz (talk | contribs)   12:52, 18 April 2011
#Comment by Aaron Schulz (talk | contribs)   22:18, 1 May 2011

Code changes still needed to make use of this. That will come soon.

#Comment by Aaron Schulz (talk | contribs)   10:26, 23 July 2011

Deploy plan:

  • Merge and sync r87606 (removes some FORCE INDEX statements related to PK)
  • Run path-fr_page_rev-index.sql but WITHOUT the "DROP COLUMN" statements (includes PK change)
  • Copy populateRevTimestamp.php to cluster and run it, recoding the ending fr_rev_id (say LAST_REV)
  • SVN update FR to 1.18 but don't scap
  • Set $wgReadOnly = "Performing brief site maintenance, this should only take a few minutes." for FR wikis
  • Run populateRevTimestamp.php --startrev=LAST_REV (this should be quick)
  • Disable $wgReadOnly for FR wikis
  • Scap
#Comment by Aaron Schulz (talk | contribs)   10:29, 23 July 2011

Actually $wgReadOnly should be disabled *after* the scap.

Status & tagging log