r84392 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r84391‎ | r84392 | r84393 >
Date:16:25, 20 March 2011
Author:catrope
Status:ok
Tags:
Comment:
Per r83812 CR, solve the categorymembers paging problem by doing separate queries for each value of cl_type, with some trickery to make paging work. This makes the enum->varchar schema change for cl_type unnecessary, so I'll revert that.

Code was largely copied from Tim's CR comment on r83812 but adapted to deal with the fact that we have to apply the cmcontinue-induced WHERE on cl_sortkey and cl_from only for the first query. Because ApiQueryBase doesn't let us unset or overwrite conditions like these in a nice way, I added an $extraQuery parameter to ApiQueryBase::select() that excepts additional query parameters that are only applied to that query but not stored in the object.
Modified paths:
  • /trunk/phase3/includes/api/ApiQueryBase.php (modified) (history)
  • /trunk/phase3/includes/api/ApiQueryCategoryMembers.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/api/ApiQueryBase.php
@@ -246,14 +246,21 @@
247247 * Execute a SELECT query based on the values in the internal arrays
248248 * @param $method string Function the query should be attributed to.
249249 * You should usually use __METHOD__ here
 250+ * @param $extraQuery array Query data to add but not store in the object
 251+ * Format is array( 'tables' => ..., 'fields' => ..., 'where' => ..., 'options' => ..., 'join_conds' => ... )
250252 * @return ResultWrapper
251253 */
252 - protected function select( $method ) {
 254+ protected function select( $method, $extraQuery = array() ) {
 255+ // Merge $this->tables with $extraQuery['tables'], $this->fields with $extraQuery['fields'], etc.
 256+ foreach ( array( 'tables', 'fields', 'where', 'options', 'join_conds' ) as $var ) {
 257+ $$var = array_merge( $this->{$var}, isset( $extraQuery[$var] ) ? (array)$extraQuery[$var] : array() );
 258+ }
 259+
253260 // getDB has its own profileDBIn/Out calls
254261 $db = $this->getDB();
255262
256263 $this->profileDBIn();
257 - $res = $db->select( $this->tables, $this->fields, $this->where, $method, $this->options, $this->join_conds );
 264+ $res = $db->select( $tables, $fields, $where, $method, $options, $join_conds );
258265 $this->profileDBOut();
259266
260267 return $res;
Index: trunk/phase3/includes/api/ApiQueryCategoryMembers.php
@@ -99,7 +99,8 @@
100100 $this->addTables( array( 'page', 'categorylinks' ) ); // must be in this order for 'USE INDEX'
101101
102102 $this->addWhereFld( 'cl_to', $categoryTitle->getDBkey() );
103 - $this->addWhereFld( 'cl_type', $params['type'] );
 103+ $queryTypes = $params['type'];
 104+ $contWhere = false;
104105
105106 // Scanning large datasets for rare categories sucks, and I already told
106107 // how to have efficient subcategory access :-) ~~~~ (oh well, domas)
@@ -129,20 +130,22 @@
130131 'by the previous query', '_badcontinue'
131132 );
132133 }
133 - $escType = $this->getDB()->addQuotes( $cont[0] );
 134+
 135+ // Remove the types to skip from $queryTypes
 136+ $contTypeIndex = array_search( $cont[0], $queryTypes );
 137+ $queryTypes = array_slice( $queryTypes, $contTypeIndex );
 138+
 139+ // Add a WHERE clause for sortkey and from
134140 $from = intval( $cont[1] );
135141 $escSortkey = $this->getDB()->addQuotes( $cont[2] );
136142 $op = $dir == 'newer' ? '>' : '<';
137 - $this->addWhere( "cl_type $op $escType OR " .
138 - "(cl_type = $escType AND " .
139 - "(cl_sortkey $op $escSortkey OR " .
 143+ // $contWhere is used further down
 144+ $contWhere = "cl_sortkey $op $escSortkey OR " .
140145 "(cl_sortkey = $escSortkey AND " .
141 - "cl_from $op= $from)))"
142 - );
 146+ "cl_from $op= $from)";
143147
144148 } else {
145 - // The below produces ORDER BY cl_type, cl_sortkey, cl_from, possibly with DESC added to each of them
146 - $this->addWhereRange( 'cl_type', $dir, null, null );
 149+ // The below produces ORDER BY cl_sortkey, cl_from, possibly with DESC added to each of them
147150 $this->addWhereRange( 'cl_sortkey',
148151 $dir,
149152 $params['startsortkey'],
@@ -157,9 +160,29 @@
158161 $limit = $params['limit'];
159162 $this->addOption( 'LIMIT', $limit + 1 );
160163
 164+ // Run a separate SELECT query for each value of cl_type.
 165+ // This is needed because cl_type is an enum, and MySQL has
 166+ // inconsistencies between ORDER BY cl_type and
 167+ // WHERE cl_type >= 'foo' making proper paging impossible
 168+ // and unindexed.
 169+ $rows = array();
 170+ $first = true;
 171+ foreach ( $queryTypes as $type ) {
 172+ $extraConds = array( 'cl_type' => $type );
 173+ if ( $first && $contWhere ) {
 174+ // Continuation condition. Only added to the
 175+ // first query, otherwise we'll skip things
 176+ $extraConds[] = $contWhere;
 177+ }
 178+ $res = $this->select( __METHOD__, array( 'where' => $extraConds ) );
 179+ $rows = array_merge( $rows, iterator_to_array( $res ) );
 180+ if ( count( $rows ) >= $limit + 1 ) {
 181+ break;
 182+ }
 183+ $first = false;
 184+ }
161185 $count = 0;
162 - $res = $this->select( __METHOD__ );
163 - foreach ( $res as $row ) {
 186+ foreach ( $rows as $row ) {
164187 if ( ++ $count > $limit ) {
165188 // We've reached the one extra which shows that there are additional pages to be had. Stop here...
166189 // TODO: Security issue - if the user has no right to view next title, it will still be shown

Follow-up revisions

RevisionCommit summaryAuthorDate
r84394Revert r83812 (schema change for cl_type enum), no longer needed after r84392...catrope16:30, 20 March 2011
r84430Followup r84392...reedy22:35, 20 March 2011
r846131.17wmf1: MFT r81692, r82468, r83814, r83885, r83891, r83897, r83902, r83903,...catrope17:42, 23 March 2011
r85434MFT: r83885, r83891, r83897, r83902, r83903, r83934, r83965, r83979, r83988, ...demon13:38, 5 April 2011

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r83812Schema change: change cl_type from ENUM('page', 'subcat', 'file') to varchar(...catrope10:35, 13 March 2011

Status & tagging log