r72308 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: r72308
Date: 20:52, 3 September 2010
Author: simetrical
Status: ok
Tags:
Comment:
Further categorylinks schema changes

Per review by Tim, I made two changes:

1) Fix cl_sortkey to be varbinary(255).

2) Expand cl_collation to varbinary(32), and change $wgCollationVersion
to $wgCategoryCollation, to account for the variety of collations we
might have. tinyint is too small. I could have gone with int, but
that's annoyingly inscrutable in practice, as we all know from namespace
fields.
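
As a rough illustration of why a named collation is convenient here, the sketch below marks rows as stale whenever their stored collation differs from the configured name, then regenerates their sortkeys. The in-memory `$rows` array, the `updateStaleRows()` helper, and the use of `strtoupper()` as an ASCII-only stand-in for the real sortkey generation are all assumptions made for this example; they are not part of the commit.

```php
<?php
// Hypothetical stand-in for $wgCategoryCollation.
$categoryCollation = 'uppercase';

// Hypothetical in-memory rows standing in for the categorylinks table.
// An empty cl_collation marks a legacy row that has never been collated.
$rows = array(
	array( 'title' => 'apple', 'cl_collation' => '', 'cl_sortkey' => 'apple' ),
	array( 'title' => 'Banana', 'cl_collation' => 'uppercase', 'cl_sortkey' => 'BANANA' ),
);

// Regenerate the sortkey of every row whose stored collation differs from
// the configured one. strtoupper() is only an ASCII stand-in for the real
// sortkey generation.
function updateStaleRows( array $rows, $collation ) {
	$updated = 0;
	foreach ( $rows as $i => $row ) {
		if ( $row['cl_collation'] !== $collation ) {
			$rows[$i]['cl_sortkey'] = strtoupper( $row['title'] );
			$rows[$i]['cl_collation'] = $collation;
			$updated++;
		}
	}
	return array( $rows, $updated );
}

list( $rows, $updated ) = updateStaleRows( $rows, $categoryCollation );
echo "Updated $updated row(s)\n";
```

With a string column, a stale row is recognizable at a glance ('' versus 'uppercase'), which is the readability argument made above against an integer code.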

To make the upgrade easier for non-trunk users, I updated the old patch
file to incorporate the new changes, using the updatelog table so that
people upgrading from 1.16 won't have to do two alters on categorylinks.
I didn't test the upgrade-from-1.16 code path yet, so if anyone tests
that and it seems not to break, commenting to that effect would be
appreciated.
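
The two upgrade paths described above can be sketched as a small simulation. It assumes that the 'addField' updater skips its patch when cl_collation already exists, and that the second patch is skipped when the 'cl_fields_update' key is already in updatelog. The `makeWiki()`, `applyFirstPatch()`, `applySecondPatch()`, and `runUpdaters()` helpers and the array-based "database" are hypothetical names used only for this illustration.

```php
<?php
// Hypothetical stand-in for a wiki database: the columns present on
// categorylinks, the keys recorded in updatelog, and a counter of ALTERs run.
function makeWiki( array $columns, array $updatelog ) {
	return array( 'columns' => $columns, 'updatelog' => $updatelog, 'alters' => 0 );
}

// Models the 'addField' entry: run the first patch only if cl_collation is
// missing. The revised first patch also records the updatelog key.
function applyFirstPatch( array $wiki ) {
	if ( in_array( 'cl_collation', $wiki['columns'] ) ) {
		return $wiki;
	}
	$wiki['columns'][] = 'cl_collation';
	$wiki['updatelog'][] = 'cl_fields_update';
	$wiki['alters']++;
	return $wiki;
}

// Models do_cl_fields_update(): skip when the updatelog key is present.
function applySecondPatch( array $wiki ) {
	if ( in_array( 'cl_fields_update', $wiki['updatelog'] ) ) {
		return $wiki;
	}
	$wiki['updatelog'][] = 'cl_fields_update';
	$wiki['alters']++;
	return $wiki;
}

function runUpdaters( array $wiki ) {
	return applySecondPatch( applyFirstPatch( $wiki ) );
}

// Upgrading from 1.16: no cl_collation column yet.
$from116 = runUpdaters( makeWiki( array(), array() ) );
// Trunk user who applied the original first patch: column exists, no key.
$fromTrunk = runUpdaters( makeWiki( array( 'cl_collation' ), array() ) );
// Running the updaters again on the trunk wiki performs no further ALTER.
$rerun = runUpdaters( $fromTrunk );

echo $from116['alters'], ' ', $fromTrunk['alters'], ' ', $rerun['alters'], "\n";
```

Under these assumptions each path performs exactly one ALTER on categorylinks, and repeated runs add none.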

Also removed wfDeprecated() from archive(). Do *not* add this to
functions that are still actively used in core. If you think this
function is so terrible that it really mustn't be used, remove callers
yourself, don't pester every single developer with messages in the hope
that someone else will do it for you.
Modified paths:
  • /trunk/phase3/includes/DefaultSettings.php (modified)
  • /trunk/phase3/includes/LinksUpdate.php (modified)
  • /trunk/phase3/includes/installer/MysqlUpdater.php (modified)
  • /trunk/phase3/maintenance/archives/patch-categorylinks-better-collation.sql (modified)
  • /trunk/phase3/maintenance/archives/patch-categorylinks-better-collation2.sql (added)
  • /trunk/phase3/maintenance/tables.sql (modified)
  • /trunk/phase3/maintenance/updateCollation.php (modified)
  • /trunk/phase3/maintenance/updaters.inc (modified)

Diff

Index: trunk/phase3/maintenance/archives/patch-categorylinks-better-collation2.sql
@@ -0,0 +1,12 @@
+--
+-- patch-categorylinks-better-collation2.sql
+--
+-- Bugs 164, 1211, 23682. This patch exists for trunk users who already
+-- applied the first patch in its original version. The first patch was
+-- updated to incorporate the changes as well, so as not to do two alters on a
+-- large table unnecessarily for people upgrading from 1.16, so this will be
+-- skipped if unneeded.
+ALTER TABLE /*$wgDBprefix*/categorylinks
+	CHANGE COLUMN cl_sortkey cl_sortkey varbinary(255) NOT NULL default '',
+	CHANGE COLUMN cl_collation cl_collation varbinary(32) NOT NULL default '';
+INSERT IGNORE INTO /*$wgDBprefix*/updatelog (ul_key) VALUES ('cl_fields_update');
Property changes on: trunk/phase3/maintenance/archives/patch-categorylinks-better-collation2.sql
___________________________________________________________________
Added: svn:eol-style
   + native
Index: trunk/phase3/maintenance/archives/patch-categorylinks-better-collation.sql
@@ -1,11 +1,15 @@
 --
 -- patch-categorylinks-better-collation.sql
 --
+-- Bugs 164, 1211, 23682. This is the second version of this patch; the
+-- changes are also incorporated into patch-categorylinks-better-collation2.sql,
+-- for the benefit of trunk users who applied the original.
 ALTER TABLE /*$wgDBprefix*/categorylinks
+	CHANGE COLUMN cl_sortkey cl_sortkey varbinary(255) NOT NULL default '',
 	ADD COLUMN cl_sortkey_prefix varchar(255) binary NOT NULL default '',
-	ADD COLUMN cl_collation tinyint NOT NULL default 0,
+	ADD COLUMN cl_collation varbinary(32) NOT NULL default '',
 	ADD COLUMN cl_type ENUM('page', 'subcat', 'file') NOT NULL default 'page',
 	ADD INDEX (cl_collation),
 	DROP INDEX cl_sortkey,
 	ADD INDEX cl_sortkey (cl_to, cl_type, cl_sortkey, cl_from);
+INSERT IGNORE INTO /*$wgDBprefix*/updatelog (ul_key) VALUES ('cl_fields_update');
Index: trunk/phase3/maintenance/updaters.inc
@@ -103,7 +103,6 @@
 }
 
 function archive( $name ) {
-	wfDeprecated( __FUNCTION__ );
 	return DatabaseBase::patchPath( $name );
 }
 
@@ -833,12 +832,23 @@
 	$task->execute();
 }
 
+function do_cl_fields_update() {
+	if ( update_row_exists( 'cl_fields_update' ) ) {
+		wfOut( "...categorylinks up-to-date.\n" );
+		return;
+	}
+	wfOut( 'Updating categorylinks (again)...' );
+	global $wgDatabase;
+	$wgDatabase->sourceFile( archive( 'patch-categorylinks-better-collation2.sql' ) );
+	wfOut( "done.\n" );
+}
+
 function do_collation_update() {
-	global $wgDatabase, $wgCollationVersion;
+	global $wgDatabase, $wgCategoryCollation;
 	if ( $wgDatabase->selectField(
 		'categorylinks',
 		'COUNT(*)',
-		'cl_collation != ' . $wgDatabase->addQuotes( $wgCollationVersion ),
+		'cl_collation != ' . $wgDatabase->addQuotes( $wgCategoryCollation ),
 		__FUNCTION__
 	) == 0 ) {
 		wfOut( "...collations up-to-date.\n" );
Index: trunk/phase3/maintenance/tables.sql
@@ -493,12 +493,7 @@
   -- A binary string obtained by applying a sortkey generation algorithm
   -- (Language::convertToSortkey()) to page_title, or cl_sortkey_prefix . "\0"
   -- . page_title if cl_sortkey_prefix is nonempty.
-  --
-  -- Truncate so that the cl_sortkey key fits in 1000 bytes (MyISAM 5 with
-  -- server_character_set=utf8). FIXME: this truncation probably makes no
-  -- sense anymore; we should be using varbinary for this, utf8 will break
-  -- everything.
-  cl_sortkey varchar(70) binary NOT NULL default '',
+  cl_sortkey varbinary(255) NOT NULL default '',
 
   -- A prefix for the raw sortkey manually specified by the user, either via
   -- [[Category:Foo|prefix]] or {{defaultsort:prefix}}. If nonempty, it's
@@ -511,12 +506,12 @@
   -- sorting method by approximate addition time.
   cl_timestamp timestamp NOT NULL,
 
-  -- Stores $wgCollationVersion at the time cl_sortkey was generated. This can
-  -- be used to install new collation versions, tracking which rows are not yet
-  -- updated. 0 means no collation, this is a legacy row that needs to be
+  -- Stores $wgCategoryCollation at the time cl_sortkey was generated. This
+  -- can be used to install new collation versions, tracking which rows are not
+  -- yet updated. '' means no collation, this is a legacy row that needs to be
   -- updated by updateCollation.php. In the future, it might be possible to
   -- specify different collations per category.
-  cl_collation tinyint NOT NULL default 0,
+  cl_collation varbinary(32) NOT NULL default '',
 
   -- Stores whether cl_from is a category, file, or other page, so we can
   -- paginate the three categories separately. This never has to be updated
Index: trunk/phase3/maintenance/updateCollation.php
@@ -15,10 +15,10 @@
 	public function __construct() {
 		parent::__construct();
 
-		global $wgCollationVersion;
+		global $wgCategoryCollation;
 		$this->mDescription = <<<TEXT
 This script will find all rows in the categorylinks table whose collation is
-out-of-date (cl_collation != $wgCollationVersion) and repopulate cl_sortkey
+out-of-date (cl_collation != '$wgCategoryCollation') and repopulate cl_sortkey
 using the page title and cl_sortkey_prefix. If everything's collation is
 up-to-date, it will do nothing.
 TEXT;
@@ -27,13 +27,13 @@
 	}
 
 	public function execute() {
-		global $wgCollationVersion, $wgContLang;
+		global $wgCategoryCollation, $wgContLang;
 
 		$dbw = wfGetDB( DB_MASTER );
 		$count = $dbw->selectField(
 			'categorylinks',
 			'COUNT(*)',
-			'cl_collation != ' . $dbw->addQuotes( $wgCollationVersion ),
+			'cl_collation != ' . $dbw->addQuotes( $wgCategoryCollation ),
 			__METHOD__
 		);
 
@@ -51,7 +51,7 @@
 				'cl_sortkey', 'page_namespace', 'page_title'
 			),
 			array(
-				'cl_collation != ' . $dbw->addQuotes( $wgCollationVersion ),
+				'cl_collation != ' . $dbw->addQuotes( $wgCategoryCollation ),
 				'cl_from = page_id'
 			),
 			__METHOD__,
@@ -89,7 +89,7 @@
 					'cl_sortkey' => $wgContLang->convertToSortkey(
 						$title->getCategorySortkey( $prefix ) ),
 					'cl_sortkey_prefix' => $prefix,
-					'cl_collation' => $wgCollationVersion,
+					'cl_collation' => $wgCategoryCollation,
 					'cl_type' => $type,
 					'cl_timestamp = cl_timestamp',
 				),
Index: trunk/phase3/includes/DefaultSettings.php
@@ -4449,10 +4449,13 @@
 /**
  * A version indicator for collations that will be stored in cl_collation for
  * all new rows. Used when the collation algorithm changes: a script checks
- * for all rows where cl_collation != $wgCollationVersion and regenerates
+ * for all rows where cl_collation != $wgCategoryCollation and regenerates
  * cl_sortkey based on the page name and cl_sortkey_prefix.
+ *
+ * Currently only supports 'uppercase', which just uppercases the string. This
+ * is a dummy collation, to be replaced later by real ones.
  */
-$wgCollationVersion = 1;
+$wgCategoryCollation = 'uppercase';
 
 /** @} */ # End categories }
 
Index: trunk/phase3/includes/installer/MysqlUpdater.php
@@ -165,6 +165,7 @@
 			array( 'drop_index_if_exists', 'iwlinks', 'iwl_prefix', 'patch-kill-iwl_prefix.sql' ),
 			array( 'drop_index_if_exists', 'iwlinks', 'iwl_prefix_from_title', 'patch-kill-iwl_pft.sql' ),
 			array( 'addField', 'categorylinks', 'cl_collation', 'patch-categorylinks-better-collation.sql' ),
+			array( 'do_cl_fields_update' ),
 			array( 'do_collation_update' ),
 		);
 	}
Index: trunk/phase3/includes/LinksUpdate.php
@@ -426,7 +426,7 @@
 	 * @private
 	 */
 	function getCategoryInsertions( $existing = array() ) {
-		global $wgContLang, $wgCollationVersion;
+		global $wgContLang, $wgCategoryCollation;
 		$diffs = array_diff_assoc( $this->mCategories, $existing );
 		$arr = array();
 		foreach ( $diffs as $name => $sortkey ) {
@@ -465,7 +465,7 @@
 				'cl_sortkey' => $sortkey,
 				'cl_timestamp' => $this->mDb->timestamp(),
 				'cl_sortkey_prefix' => $prefix,
-				'cl_collation' => $wgCollationVersion,
+				'cl_collation' => $wgCategoryCollation,
 				'cl_type' => $type,
 			);
 		}

Comments

Comment by MZMcBride, 21:56, 6 September 2010

The addition of wfDeprecated() to archive() was in r72147.
