r44591 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r44591
Date:20:46, 14 December 2008
Author:catrope
Status:deferred
Tags:
Comment:
Adding AdvancedSearch extension to SVN per request on wikitech-l. Note that this extension is not meant for production use, but rather as a starting point to develop a decent category intersection framework, which means it'll be changed heavily in the time to come (I hope)
Modified paths:
  • /trunk/extensions/AdvancedSearch (added)
  • /trunk/extensions/AdvancedSearch/AdvancedSearch.body.php (added)
  • /trunk/extensions/AdvancedSearch/AdvancedSearch.i18n.php (added)
  • /trunk/extensions/AdvancedSearch/AdvancedSearch.setup.php (added)
  • /trunk/extensions/AdvancedSearch/AdvancedSearchCategoryIntersector.php (added)
  • /trunk/extensions/AdvancedSearch/AdvancedSearchPager.php (added)
  • /trunk/extensions/AdvancedSearch/README (added)
  • /trunk/extensions/AdvancedSearch/categorysearch.sql (added)
  • /trunk/extensions/AdvancedSearch/populateCategorySearch.inc (added)
  • /trunk/extensions/AdvancedSearch/populateCategorySearch.php (added)

Diff

Index: trunk/extensions/AdvancedSearch/AdvancedSearch.i18n.php
@@ -0,0 +1,62 @@
 2+<?php
 3+/**
 4+ * This program is free software; you can redistribute it and/or modify
 5+ * it under the terms of the GNU General Public License as published by
 6+ * the Free Software Foundation; either version 3 of the License, or
 7+ * (at your option) any later version.
 8+ *
 9+ * @author Roan Kattouw <roan.kattouw@home.nl>
 10+ * @copyright Copyright (C) 2008 Roan Kattouw
 11+ * @license http://www.gnu.org/copyleft/gpl.html GNU General Public License
 12+ *
 13+ * An extension that allows for searching inside categories
 14+ * Written for MixesDB <http://mixesdb.com> by Roan Kattouw <roan.kattouw@home.nl>
 15+ * For information how to install and use this extension, see the README file.
 16+ *
 17+ */
 18+# Alert the user that this is not a valid entry point to MediaWiki if they try to access the extension file directly.
 19+if (!defined('MEDIAWIKI')) {
 20+ echo <<<EOT
 21+To install the AdvancedSearch extension, put the following line in LocalSettings.php:
 22+require_once( "\$IP/extensions/AdvancedSearch/AdvancedSearch.setup.php" );
 23+EOT;
 24+ exit(1);
 25+}
 26+
 27+$messages = array();
 28+
 29+$messages['en'] = array(
 30+ 'advancedsearch' => 'Advanced Search',
 31+ 'advancedsearch-toptext' => 'This is the advanced search, see [[Help:Search]] for more information',
 32+ 'advancedsearch-pagename' => 'AdvancedSearch',
 33+ 'advancedsearch-title' => 'Advanced Search',
 34+ 'advancedsearch-contentsearch' => 'Search in page content',
 35+ 'advancedsearch-searchin' => 'Search in:',
 36+ 'advancedsearch-searchin-title' => 'titles',
 37+ 'advancedsearch-searchin-content' => 'content',
 38+ 'advancedsearch-content-include' => 'List articles that contain:',
 39+ 'advancedsearch-content-exclude' => 'Don\'t list articles that contain:',
 40+ 'advancedsearch-categorysearch' => 'Search in categories',
 41+ 'advancedsearch-category-include' => 'List articles in these categories:',
 42+ 'advancedsearch-category-exclude' => 'Don\'t list articles in these categories:',
 43+ 'advancedsearch-speedcats' => 'Common categories',
 44+ 'advancedsearch-speedcat-dropdown' => 'Choose a category:',
 45+ 'advancedsearch-namespaces' => 'Namespaces',
 46+ 'advancedsearch-selectall' => 'Select all',
 47+ 'advancedsearch-selectnone' => 'Deselect all',
 48+ 'advancedsearch-invertselection' => 'Invert selection',
 49+ 'advancedsearch-submit' => 'Search',
 50+ 'advancedsearch-keyword-and' => 'AND',
 51+ 'advancedsearch-keyword-or' => 'OR',
 52+ 'advancedsearch-permalink' => 'A permanent link to this search is $1.',
 53+ 'advancedsearch-permalink-text' => 'here',
 54+ 'advancedsearch-permalink-check' => 'Get a permanent link to this search',
 55+ 'advancedsearch-permalink-invalid' => 'The permanent link you clicked is invalid',
 56+ 'advancedsearch-parse-error-1' => 'Parse error: unexpected \'$2\': <i>$1 <b>$2</b> $3</i>',
 57+ 'advancedsearch-parse-error-2' => 'Parse error: found \')\' without matching \'(\': <i>$1 <b>$2</b> $3</i>',
 58+ 'advancedsearch-parse-error-3' => 'Parse error: found more \'(\' than \')\'',
 59+ 'advancedsearch-parse-error-4' => 'Parse error: unterminated "',
 60+ 'advancedsearch-parse-error-5' => '"$1" is shorter than four letters. Such short words are not allowed',
 61+ 'advancedsearch-parse-error-6' => '"$1" is a frequently-used word. Such words are not allowed',
 62+ 'advancedsearch-empty-result' => 'No matches were found for your search',
 63+);
Index: trunk/extensions/AdvancedSearch/AdvancedSearch.setup.php
@@ -0,0 +1,63 @@
 2+<?php
 3+/**
 4+ * This program is free software; you can redistribute it and/or modify
 5+ * it under the terms of the GNU General Public License as published by
 6+ * the Free Software Foundation; either version 3 of the License, or
 7+ * (at your option) any later version.
 8+ *
 9+ * @author Roan Kattouw <roan.kattouw@home.nl>
 10+ * @copyright Copyright (C) 2008 Roan Kattouw
 11+ * @license http://www.gnu.org/copyleft/gpl.html GNU General Public License
 12+ *
 13+ * An extension that allows for searching inside categories
 14+ * Written for MixesDB <http://mixesdb.com> by Roan Kattouw <roan.kattouw@home.nl>
 15+ * For information how to install and use this extension, see the README file.
 16+ *
 17+ */
 18+# Alert the user that this is not a valid entry point to MediaWiki if they try to access the extension file directly.
 19+if (!defined('MEDIAWIKI')) {
 20+ echo <<<EOT
 21+To install the AdvancedSearch extension, put the following line in LocalSettings.php:
 22+require_once( "\$IP/extensions/AdvancedSearch/AdvancedSearch.setup.php" );
 23+EOT;
 24+ exit(1);
 25+}
 26+
 27+$wgExtensionCredits['specialpage'][] = array(
 28+ 'name' => 'AdvancedSearch',
 29+ 'author' => 'Roan Kattouw',
 30+ 'url' => 'http://www.mediawiki.org/wiki/Extension:AdvancedSearch',
 31+ 'version' => '1.0',
 32+ 'description' => 'Allows for searching in categories',
 33+ 'descriptionmsg' => 'advancedsearch-desc',
 34+);
 35+
 36+$dir = dirname(__FILE__) . '/';
 37+$wgExtensionMessagesFiles['AdvancedSearch'] = $dir . 'AdvancedSearch.i18n.php';
 38+$wgAutoloadClasses['AdvancedSearch'] = $dir . 'AdvancedSearch.body.php';
 39+$wgAutoloadClasses['AdvancedSearchPager'] = $dir . 'AdvancedSearchPager.php';
 40+$wgAutoloadClasses['AdvancedSearchCategoryIntersector'] = $dir . 'AdvancedSearchCategoryIntersector.php';
 41+
 42+$wgSpecialPages['AdvancedSearch'] = 'AdvancedSearch';
 43+$wgHooks['LanguageGetSpecialPageAliases'][] = 'AdvancedSearchLocalizedPageName';
 44+$wgHooks['LoadExtensionSchemaUpdates'][] = 'AdvancedSearchSchemaUpdate';
 45+$wgHooks['LinksUpdate'][] = 'AdvancedSearchCategoryIntersector::LinksUpdate';
 46+$wgHooks['ArticleDeleteComplete'][] = 'AdvancedSearchCategoryIntersector::ArticleDeleteComplete';
 47+
 48+function AdvancedSearchLocalizedPageName(&$specialPageArray, $code)
 49+{
 50+ wfLoadExtensionMessages('AdvancedSearch');
 51+ $text = wfMsg('advancedsearch-pagename');
 52+
 53+ $title = Title::newFromText($text);
 54+ $specialPageArray['AdvancedSearch'][] = $title->getDBkey();
 55+ return true;
 56+}
 57+
 58+function AdvancedSearchSchemaUpdate()
 59+{
 60+ global $wgExtNewTables;
 61+ $dir = dirname(__FILE__) . '/';
 62+ $wgExtNewTables[] = array('categorysearch', $dir . 'categorysearch.sql');
 63+ return true;
 64+}
Index: trunk/extensions/AdvancedSearch/AdvancedSearchPager.php
@@ -0,0 +1,586 @@
 2+<?php
 3+/**
 4+ * This program is free software; you can redistribute it and/or modify
 5+ * it under the terms of the GNU General Public License as published by
 6+ * the Free Software Foundation; either version 3 of the License, or
 7+ * (at your option) any later version.
 8+ *
 9+ * @author Roan Kattouw <roan.kattouw@home.nl>
 10+ * @copyright Copyright (C) 2008 Roan Kattouw
 11+ * @license http://www.gnu.org/copyleft/gpl.html GNU General Public License
 12+ *
 13+ * An extension that allows for searching inside categories
 14+ * Written for MixesDB <http://mixesdb.com> by Roan Kattouw <roan.kattouw@home.nl>
 15+ * For information how to install and use this extension, see the README file.
 16+ *
 17+ */
 18+# Alert the user that this is not a valid entry point to MediaWiki if they try to access the extension file directly.
 19+if (!defined('MEDIAWIKI')) {
 20+ echo <<<EOT
 21+To install the AdvancedSearch extension, put the following line in LocalSettings.php:
 22+require_once( "\$IP/extensions/AdvancedSearch/AdvancedSearch.setup.php" );
 23+EOT;
 24+ exit(1);
 25+}
 26+
 27+/**
 28+ * Class that pages search results
 29+ *
 30+ * FIXME: Only works with MySQL, not with PostgreSQL or Oracle
 31+ */
 32+class AdvancedSearchPager
 33+{
 34+ // array(array('foo', 'bar'), array('baz')) means
 35+ // foo AND bar OR baz
 36+ protected $mInclText, $mExclText, $mInclCats, $mExclCats, $mNamespaces;
 37+ protected $mSearchTitle, $mSearchContent, $mKey;
 38+
 39+ /** A list of stop words ignored by the MySQL fulltext search */
 40+ static $stopWords = array(
 41+ 'a\'s', 'able', 'about', 'above', 'according', 'accordingly',
 42+ 'across', 'actually', 'after', 'afterwards', 'again', 'against',
 43+ 'ain\'t', 'all', 'allow', 'allows', 'almost', 'alone', 'along',
 44+ 'already', 'also', 'although', 'always', 'am', 'among',
 45+ 'amongst', 'an', 'and', 'another', 'any', 'anybody', 'anyhow',
 46+ 'anyone', 'anything', 'anyway', 'anyways', 'anywhere', 'apart',
 47+ 'appear', 'appreciate', 'appropriate', 'are', 'aren\'t',
 48+ 'around', 'as', 'aside', 'ask', 'asking', 'associated', 'at',
 49+ 'available', 'away', 'awfully', 'be', 'became', 'because',
 50+ 'become', 'becomes', 'becoming', 'been', 'before', 'beforehand',
 51+ 'behind', 'being', 'believe', 'below', 'beside', 'besides',
 52+ 'best', 'better', 'between', 'beyond', 'both', 'brief', 'but',
 53+ 'by', 'c\'mon', 'c\'s', 'came', 'can', 'can\'t', 'cannot',
 54+ 'cant', 'cause', 'causes', 'certain', 'certainly', 'changes',
 55+ 'clearly', 'co', 'com', 'come', 'comes', 'concerning',
 56+ 'consequently', 'consider', 'considering', 'contain',
 57+ 'containing', 'contains', 'corresponding', 'could',
 58+ 'couldn\'t', 'course', 'currently', 'definitely', 'described',
 59+ 'despite', 'did', 'didn\'t', 'different', 'do', 'does',
 60+ 'doesn\'t', 'doing', 'don\'t', 'done', 'down', 'downwards',
 61+ 'during', 'each', 'edu', 'eg', 'eight', 'either', 'else',
 62+ 'elsewhere', 'enough', 'entirely', 'especially', 'et', 'etc',
 63+ 'even', 'ever', 'every', 'everybody', 'everyone', 'everything',
 64+ 'everywhere', 'ex', 'exactly', 'example', 'except', 'far',
 65+ 'few', 'fifth', 'first', 'five', 'followed', 'following',
 66+ 'follows', 'for', 'former', 'formerly', 'forth', 'four',
 67+ 'from', 'further', 'furthermore', 'get', 'gets', 'getting',
 68+ 'given', 'gives', 'go', 'goes', 'going', 'gone', 'got',
 69+ 'gotten', 'greetings', 'had', 'hadn\'t', 'happens', 'hardly',
 70+ 'has', 'hasn\'t', 'have', 'haven\'t', 'having', 'he', 'he\'s',
 71+ 'hello', 'help', 'hence', 'her', 'here', 'here\'s',
 72+ 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself',
 73+ 'hi', 'him', 'himself', 'his', 'hither', 'hopefully', 'how',
 74+ 'howbeit', 'however', 'i\'d', 'i\'ll', 'i\'m', 'i\'ve', 'ie',
 75+ 'if', 'ignored', 'immediate', 'in', 'inasmuch', 'inc',
 76+ 'indeed', 'indicate', 'indicated', 'indicates', 'inner',
 77+ 'insofar', 'instead', 'into', 'inward', 'is', 'isn\'t', 'it',
 78+ 'it\'d', 'it\'ll', 'it\'s', 'its', 'itself', 'just', 'keep',
 79+ 'keeps', 'kept', 'know', 'knows', 'known', 'last', 'lately',
 80+ 'later', 'latter', 'latterly', 'least', 'less', 'lest', 'let',
 81+ 'let\'s', 'like', 'liked', 'likely', 'little', 'look',
 82+ 'looking', 'looks', 'ltd', 'mainly', 'many', 'may', 'maybe',
 83+ 'me', 'mean', 'meanwhile', 'merely', 'might', 'more',
 84+ 'moreover', 'most', 'mostly', 'much', 'must', 'my', 'myself',
 85+ 'name', 'namely', 'nd', 'near', 'nearly', 'necessary', 'need',
 86+ 'needs', 'neither', 'never', 'nevertheless', 'new', 'next',
 87+ 'nine', 'no', 'nobody', 'non', 'none', 'noone', 'nor',
 88+ 'normally', 'not', 'nothing', 'novel', 'now', 'nowhere',
 89+ 'obviously', 'of', 'off', 'often', 'oh', 'ok', 'okay', 'old',
 90+ 'on', 'once', 'one', 'ones', 'only', 'onto', 'or', 'other',
 91+ 'others', 'otherwise', 'ought', 'our', 'ours', 'ourselves',
 92+ 'out', 'outside', 'over', 'overall', 'own', 'particular',
 93+ 'particularly', 'per', 'perhaps', 'placed', 'please', 'plus',
 94+ 'possible', 'presumably', 'probably', 'provides', 'que',
 95+ 'quite', 'qv', 'rather', 'rd', 're', 'really', 'reasonably',
 96+ 'regarding', 'regardless', 'regards', 'relatively',
 97+ 'respectively', 'right', 'said', 'same', 'saw', 'say',
 98+ 'saying', 'says', 'second', 'secondly', 'see', 'seeing',
 99+ 'seem', 'seemed', 'seeming', 'seems', 'seen', 'self', 'selves',
 100+ 'sensible', 'sent', 'serious', 'seriously', 'seven', 'several',
 101+ 'shall', 'she', 'should', 'shouldn\'t', 'since', 'six', 'so',
 102+ 'some', 'somebody', 'somehow', 'someone', 'something',
 103+ 'sometime', 'sometimes', 'somewhat', 'somewhere', 'soon',
 104+ 'sorry', 'specified', 'specify', 'specifying', 'still', 'sub',
 105+ 'such', 'sup', 'sure', 't\'s', 'take', 'taken', 'tell',
 106+ 'tends', 'th', 'than', 'thank', 'thanks', 'thanx', 'that',
 107+ 'that\'s', 'thats', 'the', 'their', 'theirs', 'them',
 108+ 'themselves', 'then', 'thence', 'there', 'there\'s',
 109+ 'thereafter', 'thereby', 'therefore', 'therein', 'theres',
 110+ 'thereupon', 'these', 'they', 'they\'d', 'they\'ll',
 111+ 'they\'re', 'they\'ve', 'think', 'third', 'this', 'thorough',
 112+ 'thoroughly', 'those', 'though', 'three', 'through',
 113+ 'throughout', 'thru', 'thus', 'to', 'together', 'too', 'took',
 114+ 'toward', 'towards', 'tried', 'tries', 'truly', 'try',
 115+ 'trying', 'twice', 'two', 'un', 'under', 'unfortunately',
 116+ 'unless', 'unlikely', 'until', 'unto', 'up', 'upon', 'us',
 117+ 'use', 'used', 'useful', 'uses', 'using', 'usually', 'value',
 118+ 'various', 'very', 'via', 'viz', 'vs', 'want', 'wants', 'was',
 119+ 'wasn\'t', 'way', 'we', 'we\'d', 'we\'ll', 'we\'re', 'we\'ve',
 120+ 'welcome', 'well', 'went', 'were', 'weren\'t', 'what',
 121+ 'what\'s', 'whatever', 'when', 'whence', 'whenever', 'where',
 122+ 'where\'s', 'whereafter', 'whereas', 'whereby', 'wherein',
 123+ 'whereupon', 'wherever', 'whether', 'which', 'while',
 124+ 'whither', 'who', 'who\'s', 'whoever', 'whole', 'whom',
 125+ 'whose', 'why', 'will', 'willing', 'wish', 'with', 'within',
 126+ 'without', 'won\'t', 'wonder', 'would', 'wouldn\'t',
 127+ 'yes', 'yet', 'you', 'you\'d', 'you\'ll', 'you\'re', 'you\'ve',
 128+ 'your', 'yours', 'yourself', 'yourselves', 'zero');
 129+
 130+ /**
 131+ * Constructor
 132+ * @param $incltext string Text from the content-incl form field
 133+ * @param $excltext string Text from the content-excl form field
 134+ * @param $inclcats string Text from the cats-incl form field
 135+ * @param $exclcats string Text from the cats-excl form field
 136+ * @param $speedcats array Array of speedcat categories
 137+ * @param $dropdown string Category picked from the speedcat dropdown
 138+ * @param $namespaces array Array of NS_* constants
 139+ */
 140+ public function __construct($incltext, $excltext, $inclcats, $exclcats, $speedcats, $dropdown, $namespaces, $searchTitle, $searchContent)
 141+ {
 142+ $this->mInclText = $this->parse($incltext, true);
 143+ $this->mExclText = $this->parse($excltext, true);
 144+
 145+ $sc = implode(' ' . wfMsg('advancedsearch-keyword-or') . ' ', $speedcats);
 146+ $ic = "( $inclcats ) AND ( $sc ) AND ( $dropdown )";
 147+ $this->mInclCats = $this->parse($ic, false);
 148+ $this->mExclCats = $this->parse($exclcats, false);
 149+ $this->mNamespaces = $namespaces;
 150+ $this->mSearchTitle = $searchTitle;
 151+ $this->mSearchContent = $searchContent;
 152+ $this->mKey = md5(implode("\0", array($incltext, $excltext, $ic, $exclcats,
 153+ implode(',', $namespaces),
 154+ $searchTitle ? 1 : 0,
 155+ $searchContent ? 1 : 0)));
 156+
 157+ # Check whether all namespace boxes were checked
 158+ # If so, save some work
 159+ if($this->mNamespaces == array_keys(AdvancedSearch::searchableNamespaces()))
 160+ $this->mNamespaces = array();
 161+ $this->mDb = wfGetDb(DB_SLAVE);
 162+ }
 163+
 164+ public function getSearchTitle()
 165+ {
 166+ return $this->mSearchTitle;
 167+ }
 168+
 169+ public function getSearchContent()
 170+ {
 171+ return $this->mSearchContent;
 172+ }
 173+
 174+ public function cacheQuery()
 175+ {
 176+ global $wgMemc;
 177+ $key = wfMemcKey('advancedsearch', $this->mKey);
 178+ $wgMemc->set($key, $this);
 179+ return $this->mKey;
 180+ }
 181+
 182+ public static function newFromKey($key)
 183+ {
 184+ global $wgMemc;
 185+ $retval = $wgMemc->get(wfMemcKey('advancedsearch', $key));
 186+ if($retval instanceof AdvancedSearchPager)
 187+ $retval->mDb = wfGetDb(DB_SLAVE);
 188+ return $retval;
 189+ }
 190+
 191+ /**
 192+ * Find out whether any errors occurred when parsing the search strings
 193+ * @return array Array of errors, which are either strings or false (no error)
 194+ */
 195+ public function getParseErrors()
 196+ {
 197+ return array(
 198+ (is_string($this->mInclText) ? $this->mInclText : false),
 199+ (is_string($this->mExclText) ? $this->mExclText : false),
 200+ (is_string($this->mInclCats) ? $this->mInclCats : false),
 201+ (is_string($this->mExclCats) ? $this->mExclCats : false)
 202+ );
 203+ }
 204+
 205+ /**
 206+ * Get a value from an array
 207+ * @param $arr array Array to get the value from
 208+ * @param $indices array Array of indices, e.g. array(1,2,3) gets $arr[1][2][3]
 209+ * @return mixed
 210+ */
 211+ public static function getFromArray($arr, $indices)
 212+ {
 213+ $i = array_shift($indices);
 214+ if(empty($indices))
 215+ return @$arr[$i];
 216+ return self::getFromArray(@$arr[$i], $indices);
 217+ }
 218+
 219+ /**
 220+ * Set a value in an array
 221+ * @param $value mixed Value to set
 222+ * @param $arr array Array to work on
 223+ * @param $indices array See getFromArray()
 224+ */
 225+ public static function setInArray($value, &$arr, $indices)
 226+ {
 227+ $i = array_shift($indices);
 228+ if(empty($indices))
 229+ $arr[$i] = $value;
 230+ else
 231+ self::setInArray($value, $arr[$i], $indices);
 232+ }
 233+
 234+ /**
 235+ * Parse text from a form field
 236+ * @param $text string Text from a form field
 237+ * @param $parseSpaces bool If true, parse spaces as ANDs
 238+ * @return mixed Array, or error message (string) if $text is invalid
 239+ */
 240+ protected function parse($text, $parseSpaces)
 241+ {
 242+ $arr = array();
 243+ $boom = explode(' ', $text);
 244+ // Keep track of where we are in $arr
 245+ $indices = array(0, 0);
 246+ $depth = 0;
 247+ $tokens = array(wfMsg('advancedsearch-keyword-and'),
 248+ wfMsg('advancedsearch-keyword-or'),
 249+ '(', ')', '');
 250+ // We can expect two things:
 251+ // 'token': we expect AND, OR, ) or a continuation
 252+ // 'word': we expect ( or text
 253+ $expecting = 'word';
 254+ for($i = 0; $i < count($boom); $i++)
 255+ {
 256+ if($expecting == 'token')
 257+ {
 258+ if($boom[$i] == wfMsg('advancedsearch-keyword-and'))
 259+ {
 260+ // Increment the last index
 261+ end($indices);
 262+ $indices[key($indices)]++;
 263+ $expecting = 'word';
 264+ continue;
 265+ }
 266+ if($boom[$i] == wfMsg('advancedsearch-keyword-or'))
 267+ {
 268+ // Increment the second-to-last index
 269+ // and set the last one to 0
 270+ end($indices);
 271+ $indices[key($indices)] = 0;
 272+ prev($indices);
 273+ $indices[key($indices)]++;
 274+ $expecting = 'word';
 275+ continue;
 276+ }
 277+ // ( is invalid here
 278+ if($boom[$i] == '(')
 279+ return wfMsg('advancedsearch-parse-error-1',
 280+ htmlspecialchars(@$boom[$i - 1]),
 281+ htmlspecialchars($boom[$i]),
 282+ htmlspecialchars(@$boom[$i + 1]));
 283+ // We found a word, so it's a continuation
 284+ if($parseSpaces && $boom[$i] != ')')
 285+ {
 286+ // Increment the last index
 287+ end($indices);
 288+ $indices[key($indices)]++;
 289+ }
 290+ }
 291+ // We're expecting a word
 292+ // Check that it's not a token
 293+ if($boom[$i] == wfMsg('advancedsearch-keyword-and') ||
 294+ $boom[$i] == wfMsg('advancedsearch-keyword-or'))
 295+ return wfMsg('advancedsearch-parse-error-1',
 296+ htmlspecialchars(@$boom[$i - 1]),
 297+ htmlspecialchars($boom[$i]),
 298+ htmlspecialchars(@$boom[$i + 1]));
 299+ // See if it's ( or )
 300+ if($boom[$i] == '(')
 301+ {
 302+ // Go one level deeper
 303+ $indices[] = 0;
 304+ $indices[] = 0;
 305+ $depth++;
 306+ $expecting = 'word';
 307+ continue;
 308+ }
 309+ if($boom[$i] == ')')
 310+ {
 311+ // Go one level up
 312+ $depth--;
 313+ if($depth < 0)
 314+ // More ) than (
 315+ return wfMsg('advancedsearch-parse-error-2',
 316+ htmlspecialchars(@$boom[$i - 1]),
 317+ htmlspecialchars($boom[$i]),
 318+ htmlspecialchars(@$boom[$i + 1]));
 319+ array_pop($indices);
 320+ array_pop($indices);
 321+ $expecting = 'token';
 322+ continue;
 323+ }
 324+ // Parse quotes
 325+ if(substr($boom[$i], 0, 1) == '"')
 326+ {
 327+ $word = $boom[$i];
 328+ while(substr($word, -1) !== '"')
 329+ {
 330+ $i++;
 331+ if(!isset($boom[$i]))
 332+ return wfMsg('advancedsearch-parse-error-4');
 333+ $word .= ' '. $boom[$i];
 334+ }
 335+ # Strip the quotes
 336+ $word = substr($word, 1, -1);
 337+ }
 338+ else
 339+ $word = $boom[$i];
 340+ // Put this word in the array
 341+ $lastAdded = self::getFromArray($arr, $indices);
 342+ if(empty($lastAdded))
 343+ {
 344+ if($parseSpaces || in_array(@$boom[$i + 1], $tokens))
 345+ {
 346+ # We got a single word. Check that it's not too
 347+ # short or a stop word
 348+ if(strlen($word) <= 3 && $word != '')
 349+ return wfMsg('advancedsearch-parse-error-5', $word);
 350+ if(in_array($word, self::$stopWords))
 351+ return wfMsg('advancedsearch-parse-error-6', $word);
 352+ }
 353+ self::setInArray($word, $arr, $indices);
 354+ }
 355+ else
 356+ self::setInArray("$lastAdded $word", $arr, $indices);
 357+ $expecting = 'token';
 358+ }
 359+ // Check if all ( were closed
 360+ if($depth != 0)
 361+ return wfMsg('advancedsearch-parse-error-3');
 362+ return $arr;
 363+ }
 364+
 365+ /**
 366+ * Is an array really empty? Also checks for nested emptiness, e.g.
 367+ * array(array(''))
 368+ * @param $arr array
 369+ * @return bool
 370+ */
 371+ static function isEmpty($arr)
 372+ {
 373+ if(empty($arr))
 374+ return true;
 375+ if(!is_array($arr))
 376+ return false;
 377+ foreach($arr as $a)
 378+ if(!self::isEmpty($a))
 379+ return false;
 380+ return true;
 381+ }
 382+
 383+ /**
 384+ * Generate a MATCH condition
 385+ * @param $arr array $m{Incl,Excl}{Text,Cats}
 386+ * @return string A MATCH condition
 387+ */
 388+ protected function getMatchString($arr)
 389+ {
 390+ $conds = array();
 391+ foreach($arr as $a)
 392+ {
 393+ $subconds = array();
 394+ foreach((array)$a as $b)
 395+ {
 396+ if(is_array($b))
 397+ {
 398+ $m = $this->getMatchString($b);
 399+ if(!empty($m))
 400+ $subconds[] = "+($m)";
 401+ }
 402+ else
 403+ {
 404+ global $wgContLang;
 405+ $s = $wgContLang->stripForSearch($b);
 406+ $s = $this->mDb->strencode($s);
 407+ # If $s contains spaces or ( ) :, quote it
 408+ if(strpos($s, ' ') !== false
 409+ || strpos($s, '(') !== false
 410+ || strpos($s, ')') !== false
 411+ || strpos($s, ':') !== false)
 412+ $s = "\"$s\"";
 413+ if(!empty($s))
 414+ $subconds[] = "+$s";
 415+ }
 416+ }
 417+ $sc = implode(' ', $subconds);
 418+ if(!empty($sc))
 419+ $conds[] = "($sc)";
 420+ }
 421+ return implode(' ', $conds);
 422+ }
 423+
 424+ /**
 425+ * Get the DB key for a category
 426+ * @param $c string Category name
 427+ */
 428+ public static function categoryKey(&$c)
 429+ {
 430+ $t = Title::makeTitleSafe(NS_CATEGORY, $c);
 431+ if(!$t)
 432+ return false;
 433+ $c = $t->getDBkey();
 434+ }
 435+
 436+ public function getQueryInfo()
 437+ {
 438+ $db = $this->mDb;
 439+ $retval = array();
 440+
 441+ $incltext = $db->strencode($this->getMatchString($this->mInclText));
 442+ $excltext = $db->strencode($this->getMatchString($this->mExclText));
 443+ array_walk_recursive($this->mInclCats, array(__CLASS__, 'categoryKey'));
 444+ array_walk_recursive($this->mExclCats, array(__CLASS__, 'categoryKey'));
 445+ $inclcats = $db->strencode($this->getMatchString($this->mInclCats));
 446+ $exclcats = $db->strencode($this->getMatchString($this->mExclCats));
 447+
 448+ $retval['tables'][] = 'page';
 449+ if(!self::isEmpty($this->mInclText) || !self::isEmpty($this->mExclText))
 450+ {
 451+ $retval['tables'][] = 'searchindex';
 452+ $retval['conds'][] = 'page_id=si_page';
 453+ if(!self::isEmpty($this->mInclText))
 454+ {
 455+ $titlecond = $contentcond = $cond = '';
 456+ if($this->mSearchTitle)
 457+ $titlecond = "MATCH (si_title) AGAINST ('$incltext' IN BOOLEAN MODE)";
 458+ if($this->mSearchContent)
 459+ $contentcond = "MATCH (si_text) AGAINST ('$incltext' IN BOOLEAN MODE)";
 460+ if(!empty($titlecond) && !empty($contentcond))
 461+ $cond = "$titlecond OR $contentcond";
 462+ else
 463+ $cond = $titlecond . $contentcond;
 464+ if(!empty($cond))
 465+ $retval['conds'][] = $cond;
 466+ }
 467+ if(!self::isEmpty($this->mExclText))
 468+ {
 469+ $titlecond = $contentcond = $cond = '';
 470+ if($this->mSearchTitle)
 471+ $titlecond = "NOT MATCH (si_title) AGAINST ('$excltext' IN BOOLEAN MODE)";
 472+ if($this->mSearchContent)
 473+ $contentcond = "NOT MATCH (si_text) AGAINST ('$excltext' IN BOOLEAN MODE)";
 474+ if(!empty($titlecond) && !empty($contentcond))
 475+ $cond = "$titlecond AND $contentcond";
 476+ else
 477+ $cond = $titlecond . $contentcond;
 478+ if(!empty($cond))
 479+ $retval['conds'][] = $cond;
 480+ }
 481+ }
 482+ if(!self::isEmpty($this->mInclCats) || !self::isEmpty($this->mExclCats))
 483+ {
 484+ $retval['tables'][] = 'categorysearch';
 485+ $retval['conds'][] = 'page_id=cs_page';
 486+ if(!self::isEmpty($this->mInclCats))
 487+ $retval['conds'][] = "MATCH (cs_categories) AGAINST ('$inclcats' IN BOOLEAN MODE)";
 488+ if(!self::isEmpty($this->mExclCats))
 489+ $retval['conds'][] = "NOT MATCH (cs_categories) AGAINST ('$exclcats' IN BOOLEAN MODE)";
 490+ }
 491+ if(!empty($this->mNamespaces))
 492+ $retval['conds']['page_namespace'] = $this->mNamespaces;
 493+
 494+ $retval['fields'] = array('page_namespace', 'page_title');
 495+ return $retval;
 496+ }
 497+
 498+ public function reallyDoQuery()
 499+ {
 500+ if(isset($this->mResult))
 501+ return $this->mResult;
 502+ global $wgRequest;
 503+ list($this->mLimit, $this->mOffset) = $wgRequest->getLimitOffset(50, 'limit');
 504+ $info = $this->getQueryInfo();
 505+ $tables = $info['tables'];
 506+ $fields = $info['fields'];
 507+ $conds = isset($info['conds']) ? $info['conds'] : array();
 508+ $options = isset($info['options']) ? $info['options'] : array();
 509+ $join_conds = isset($info['join_conds']) ? $info['join_conds'] : array();
 510+ if($this->mOffset != '')
 511+ $options['OFFSET'] = intval($this->mOffset);
 512+ $options['LIMIT'] = intval($this->mLimit);
 513+ $this->mResult = $this->mDb->select($tables, $fields, $conds, __METHOD__, $options, $join_conds);
 514+ return new ResultWrapper($this->mDb, $this->mResult);
 515+ }
 516+
 517+ public function getNumRows()
 518+ {
 519+ if(!isset($this->mResult))
 520+ $this->reallyDoQuery();
 521+ return $this->mResult->numRows();
 522+ }
 523+
 524+ public function preprocessResults($result)
 525+ {
 526+ # Run a LinkBatch query
 527+ $lb = new LinkBatch;
 528+ $result->rewind();
 529+ while(($row = $result->fetchObject()))
 530+ $lb->add($row->page_namespace, $row->page_title);
 531+ $lb->execute();
 532+ $result->rewind();
 533+ }
 534+
 535+ public function getStartBody()
 536+ {
 537+ return Xml::openElement('table') . Xml::openElement('tr');
 538+ }
 539+
 540+ public function getEndBody()
 541+ {
 542+ return Xml::closeElement('tr') . Xml::closeElement('table');
 543+ }
 544+
 545+ public function formatRow($row)
 546+ {
 547+ global $wgUser;
 548+ static $i = 0;
 549+ $open = Xml::openElement('td', array('valign' => 'top')) . Xml::openElement('ul');
 550+ $close = Xml::closeElement('ul') . Xml::closeElement('td');
 551+ $tdb = $tdf = '';
 552+ if($i == 0)
 553+ $tdb = $open;
 554+ else if($i == ceil($this->mLimit / 2))
 555+ $tdb = $close . $open;
 556+ else if($i == $this->mLimit - 1)
 557+ $tdf = $close;
 558+ $i++;
 559+ $title = Title::makeTitle($row->page_namespace, $row->page_title);
 560+ $link = $wgUser->getSkin()->makeLinkObj($title, htmlspecialchars($title->getPrefixedText()));
 561+ return $tdb . Xml::tags('li', null, $link) . $tdf . "\n";
 562+ }
 563+
 564+ protected function getDefaultQuery()
 565+ {
 566+ $retval = $_GET;
 567+ unset($retval['offset']);
 568+ unset($retval['limit']);
 569+ unset($retval['title']);
 570+ return $retval;
 571+ }
 572+
 573+ public function getBody()
 574+ {
 575+ $res = $this->reallyDoQuery();
 576+ $this->preprocessResults($res);
 577+ $prevnext = wfViewPrevNext($this->mOffset, $this->mLimit,
 578+ SpecialPage::getTitleFor('AdvancedSearch'),
 579+ wfArrayToCGI($this->getDefaultQuery()),
 580+ ($this->getNumRows() < $this->mLimit));
 581+ $retval = $this->getStartBody();
 582+ while(($row = $res->fetchObject()))
 583+ $retval .= $this->formatRow($row);
 584+ $retval .= $this->getEndBody();
 585+ return $prevnext . $retval . $prevnext;
 586+ }
 587+}
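To make the data flow in AdvancedSearchPager easier to follow, here is a minimal standalone sketch of the two central steps: turning a search string into the nested array described in the class comment, and turning that array into a MySQL boolean-mode string. It is not the extension's code. The names sketchParseFlat() and sketchMatchString() are hypothetical, the parser only covers the flat case with $parseSpaces = false (no parentheses, quotes, stop-word or length checks), and the $wgContLang->stripForSearch() and database escaping steps are left out.

```php
<?php
// Sketch only: both function names are hypothetical, not part of the extension.

/**
 * Flat version of parse() with $parseSpaces = false.
 * AND moves to the next term in the current group, OR starts a new group.
 */
function sketchParseFlat($text, $and = 'AND', $or = 'OR')
{
    $arr = array();
    $group = 0; // plays the role of the second-to-last index in parse()
    $term = 0;  // plays the role of the last index in parse()
    foreach (explode(' ', $text) as $word) {
        if ($word === $and) {
            $term++;
        } elseif ($word === $or) {
            $group++;
            $term = 0;
        } elseif ($word !== '') {
            // Adjacent words share a slot and form one phrase
            $arr[$group][$term] = isset($arr[$group][$term])
                ? $arr[$group][$term] . ' ' . $word
                : $word;
        }
    }
    return $arr;
}

/**
 * Same shape as getMatchString(): every term in a group is required (+),
 * and the groups themselves are alternatives.
 */
function sketchMatchString(array $groups)
{
    $conds = array();
    foreach ($groups as $group) {
        $subconds = array();
        foreach ((array)$group as $term) {
            if (is_array($term)) {
                // Nested sub-expression, required as a whole
                $inner = sketchMatchString($term);
                if ($inner !== '')
                    $subconds[] = "+($inner)";
            } elseif ($term !== '') {
                // Quote anything containing a space, parenthesis or colon
                if (preg_match('/[ ():]/', $term))
                    $term = "\"$term\"";
                $subconds[] = "+$term";
            }
        }
        if ($subconds)
            $conds[] = '(' . implode(' ', $subconds) . ')';
    }
    return implode(' ', $conds);
}

echo sketchMatchString(sketchParseFlat('foo AND bar OR baz')), "\n";
// prints: (+foo +bar) (+baz)
```

In boolean mode a leading + marks a required term and an unprefixed group is optional, so the output reads as (foo AND bar) OR baz, matching the comment at the top of the class.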
Index: trunk/extensions/AdvancedSearch/AdvancedSearchCategoryIntersector.php
@@ -0,0 +1,71 @@
 2+<?php
 3+/**
 4+ * This program is free software; you can redistribute it and/or modify
 5+ * it under the terms of the GNU General Public License as published by
 6+ * the Free Software Foundation; either version 3 of the License, or
 7+ * (at your option) any later version.
 8+ *
 9+ * @author Roan Kattouw <roan.kattouw@home.nl>
 10+ * @copyright Copyright (C) 2008 Roan Kattouw
 11+ * @license http://www.gnu.org/copyleft/gpl.html GNU General Public License
 12+ *
 13+ * An extension that allows for searching inside categories
 14+ * Written for MixesDB <http://mixesdb.com> by Roan Kattouw <roan.kattouw@home.nl>
 15+ * For information how to install and use this extension, see the README file.
 16+ *
 17+ */
 18+# Alert the user that this is not a valid entry point to MediaWiki if they try to access the extension file directly.
 19+if (!defined('MEDIAWIKI')) {
 20+ echo <<<EOT
 21+To install the AdvancedSearch extension, put the following line in LocalSettings.php:
 22+require_once( "\$IP/extensions/AdvancedSearch/AdvancedSearch.setup.php" );
 23+EOT;
 24+ exit(1);
 25+}
 26+
 27+/**
 28+ * A class that does category intersections using the categorysearch table
 29+ */
 30+class AdvancedSearchCategoryIntersector
 31+{
 32+ /**
 33+ * Update the categorysearch table
 34+ * @param $pageid int Page ID
 35+ * @param $categories array Array of categories (DB keys)
 36+ */
 37+ static function update($pageid, $categories)
 38+ {
 39+ global $wgContLang;
 40+ $ctext = $wgContLang->stripForSearch(implode(' ', $categories));
 41+ $dbw = wfGetDb(DB_MASTER);
 42+ $dbw->replace('categorysearch', 'cs_page',
 43+ array('cs_page' => $pageid, 'cs_categories' => $ctext),
 44+ __METHOD__);
 45+ }
 46+
 47+ /**
 48+ * Remove the entry for a page
 49+ * @param $pageid int Page ID
 50+ */
 51+ static function remove($pageid)
 52+ {
 53+ $dbw = wfGetDb(DB_MASTER);
 54+ $dbw->delete('categorysearch', array('cs_page' => $pageid), __METHOD__);
 55+ }
 56+
 57+ /**
 58+ * Hook function for 'LinksUpdate' hook
 59+ * @param $lu LinksUpdate
 60+ */
 61+ static function LinksUpdate(&$lu)
 62+ {
 63+ self::update($lu->mId, array_keys($lu->mCategories));
 64+ return true;
 65+ }
 66+
 67+ static function ArticleDeleteComplete(&$article, &$user, $reason, $id)
 68+ {
 69+ self::remove($id);
 70+ return true;
 71+ }
 72+}
Index: trunk/extensions/AdvancedSearch/categorysearch.sql
@@ -0,0 +1,18 @@
 2+--
 3+-- Table used to do category intersections
 4+--
 5+-- This table must be MyISAM; InnoDB does not support the needed
 6+-- fulltext index.
 7+--
 8+CREATE TABLE /*$wgDBprefix*/categorysearch (
 9+ -- Key to page_id
 10+ cs_page int unsigned NOT NULL,
 11+
 12+ -- Munged version of categories
 13+ -- E.g.: "Foo Living_people Bar"
 14+ cs_categories mediumtext NOT NULL,
 15+
 16+ UNIQUE KEY (cs_page),
 17+ FULLTEXT cs_categories (cs_categories)
 18+
 19+) ENGINE=MyISAM;
Index: trunk/extensions/AdvancedSearch/populateCategorySearch.php
@@ -0,0 +1,56 @@
 2+<?php
 3+/**
 4+ * @addtogroup Maintenance
 5+ * @author Roan Kattouw
 6+ */
 7+
 8+$optionsWithArgs = array( 'begin', 'max-slave-lag', 'throttle' );
 9+
 10+$commandLineInc = dirname(__FILE__) . "/../../maintenance/commandLine.inc";
 11+if(!file_exists($commandLineInc))
 12+ die("Can't find commandLine.inc\nPlease copy it to " .
 13+ realpath(dirname(__FILE__) . "/../../") . "/maintenance or make a symlink.");
 14+require_once $commandLineInc;
 15+require_once "populateCategorySearch.inc";
 16+
 17+if( isset( $options['help'] ) ) {
 18+ echo <<<TEXT
 19+This script will populate the categorysearch table, added by the
 20+AdvancedSearch extension. It will print out progress indicators every
 21+1000 pages it adds to the table. The script is perfectly safe to run on large,
 22+live wikis, and running it multiple times is harmless. You may want to use the
 23+throttling options if it's causing too much load; they will not affect
 24+correctness.
 25+
 26+If the script is stopped and later resumed, you can use the --begin option with
 27+the last printed progress indicator to pick up where you left off. This is
 28+safe, because any newly-added intersections before this cutoff will have been
 29+added after the software update and so will be populated anyway.
 30+
 31+When the script has finished, it will make a note of this in the database, and
 32+will not run again without the --force option.
 33+
 34+Usage:
 35+ php populateCategorySearch.php [--max-slave-lag <seconds>]
 36+[--begin <pageID>] [--throttle <milliseconds>] [--force]
 37+
 38+ --begin: Only do pages with page IDs higher than this value.
 39+Default: empty (start from beginning).
 40+ --max-slave-lag: If slave lag exceeds this many seconds, wait until it
 41+drops before continuing. Default: 10.
 42+ --throttle: Wait this many milliseconds after each page. Default: 0.
 43+ --force: Run regardless of whether the database says it's been run already.
 44+TEXT;
 45+ exit( 0 );
 46+}
 47+
 48+$defaults = array(
 49+ 'begin' => '',
 50+ 'max-slave-lag' => 10,
 51+ 'throttle' => 0,
 52+ 'force' => false
 53+);
 54+$options = array_merge( $defaults, $options );
 55+
 56+populateCategorySearch( $options['begin'], $options['max-slave-lag'],
 57+ $options['throttle'], $options['force'] );
Index: trunk/extensions/AdvancedSearch/populateCategorySearch.inc
@@ -0,0 +1,86 @@
 2+<?php
 3+/**
 4+ * @addtogroup Maintenance
 5+ * @author Roan Kattouw
 6+ */
 7+
 8+define( 'REPORTING_INTERVAL', 1000 );
 9+
 10+function populateCategorySearch( $begin, $maxlag, $throttle, $force ) {
 11+ $dbw = wfGetDB( DB_MASTER );
 12+
 13+ if( !$force ) {
 14+ $row = $dbw->selectRow(
 15+ 'updatelog',
 16+ '1',
 17+ array( 'ul_key' => 'populate categorysearch' ),
 18+ __FUNCTION__
 19+ );
 20+ if( $row ) {
 21+ echo "Categorysearch table already populated. Use php ".
 22+ "extensions/AdvancedSearch/populateCategorySearch.php --force\nfrom the command line ".
 23+ "to override.\n";
 24+ return true;
 25+ }
 26+ }
 27+
 28+ $maxlag = intval( $maxlag );
 29+ $throttle = intval( $throttle );
 30+ $force = (bool)$force;
 31+ if( $begin !== '' ) {
 32+ $where = 'page_id > '.$dbw->addQuotes( $begin );
 33+ } else {
 34+ $where = null;
 35+ }
 36+ $i = 0;
 37+
 38+ while(true) {
 39+ # Get the next page ID
 40+ $row = $dbw->selectRow(
 41+ 'page',
 42+ 'page_id',
 43+ $where,
 44+ __FUNCTION__,
 45+ array('ORDER BY' => 'page_id')
 46+ );
 47+ if(!$row)
 48+ # We're done
 49+ break;
 50+ $pageid = intval($row->page_id);
 51+ $where = 'page_id > ' . $pageid;
 52+
 53+ # Get all categories this page is in
 54+ $res = $dbw->select(
 55+ 'categorylinks',
 56+ 'cl_to',
 57+ array('cl_from' => $pageid),
 58+ __FUNCTION__
 59+ );
 60+ $categories = array();
 61+ while(($row = $dbw->fetchObject($res)))
 62+ $categories[] = $row->cl_to;
 63+ $ctext = implode(' ', $categories);
 64+ AdvancedSearchCategoryIntersector::update($pageid, $categories);
 65+ $i++;
 66+ if(!($i % REPORTING_INTERVAL))
 67+ {
 68+ echo "$pageid\n";
 69+ wfWaitForSlaves($maxlag);
 70+ }
 71+ usleep($throttle*1000);
 72+ }
 73+
 74+ if( $dbw->insert(
 75+ 'updatelog',
 76+ array( 'ul_key' => 'populate categorysearch' ),
 77+ __FUNCTION__,
 78+ 'IGNORE'
 79+ )
 80+ ) {
 81+ echo "Categorysearch population complete.\n";
 82+ return true;
 83+ } else {
 84+ echo "Could not insert categorysearch population row.\n";
 85+ return false;
 86+ }
 87+}
Index: trunk/extensions/AdvancedSearch/AdvancedSearch.body.php
@@ -0,0 +1,459 @@
 2+<?php
 3+/**
 4+ * This program is free software; you can redistribute it and/or modify
 5+ * it under the terms of the GNU General Public License as published by
 6+ * the Free Software Foundation; either version 3 of the License, or
 7+ * (at your option) any later version.
 8+ *
 9+ * @author Roan Kattouw <roan.kattouw@home.nl>
 10+ * @copyright Copyright (C) 2008 Roan Kattouw
 11+ * @license http://www.gnu.org/copyleft/gpl.html GNU General Public License
 12+ *
 13+ * An extension that allows for searching inside categories
 14+ * Written for MixesDB <http://mixesdb.com> by Roan Kattouw <roan.kattouw@home.nl>
 15+ * For information how to install and use this extension, see the README file.
 16+ *
 17+ */
 18+# Alert the user that this is not a valid entry point to MediaWiki if they try to access the extension file directly.
 19+if (!defined('MEDIAWIKI')) {
 20+ echo <<<EOT
 21+To install the AdvancedSearch extension, put the following line in LocalSettings.php:
 22+require_once( "\$IP/extensions/AdvancedSearch/AdvancedSearch.setup.php" );
 23+EOT;
 24+ exit(1);
 25+}
 26+
 27+class AdvancedSearch extends SpecialPage
 28+{
 29+ public function __construct()
 30+ {
 31+ parent::__construct('AdvancedSearch');
 32+ }
 33+
 34+ public function execute($par)
 35+ {
 36+ global $wgOut, $wgRequest;
 37+ wfLoadExtensionMessages('AdvancedSearch');
 38+ $this->setHeaders();
 39+ $wgOut->setPageTitle(wfMsg('advancedsearch-title'));
 40+ if($wgRequest->getVal('do') == 'search' || !is_null($par))
 41+ $wgOut->addHTML($this->showResults($par));
 42+ else
 43+ $wgOut->addHTML($this->buildForm());
 44+ }
 45+
 46+ /**
 47+ * Generate the HTML for the permanent link to this search result
 48+ * @param $pager AdvancedSearchPager
 49+ */
 50+ protected function permaLink($pager)
 51+ {
 52+ global $wgUser;
 53+ $key = $pager->cacheQuery();
 54+ return wfMsg('advancedsearch-permalink',
 55+ $wgUser->getSkin()->makeLinkObj(
 56+ SpecialPage::getTitleFor("AdvancedSearch/$key"),
 57+ wfMsg('advancedsearch-permalink-text')));
 58+ }
 59+
 60+ protected function showResults($par)
 61+ {
 62+ global $wgRequest;
 63+ $key = $wgRequest->getVal('key', null);
 64+ if(is_null($key))
 65+ $key = $par;
 66+ $searchTitle = $searchContent = true;
 67+ if(!is_null($key))
 68+ {
 69+ $pager = AdvancedSearchPager::newFromKey($key);
 70+ if($pager instanceof AdvancedSearchPager)
 71+ {
 72+ $searchTitle = $pager->getSearchTitle();
 73+ $searchContent = $pager->getSearchContent();
 74+ }
 75+ }
 76+ else
 77+ {
 78+ $searchTitle = $wgRequest->getCheck('searchtitle');
 79+ $searchContent = $wgRequest->getCheck('searchcontent');
 80+ $pager = new AdvancedSearchPager(
 81+ $wgRequest->getVal('content-incl'),
 82+ $wgRequest->getVal('content-excl'),
 83+ $wgRequest->getVal('cat-incl'),
 84+ $wgRequest->getVal('cat-excl'),
 85+ $wgRequest->getArray('speedcats', array()),
 86+ $wgRequest->getVal('scdd'),
 87+ $wgRequest->getArray('namespaces', array()),
 88+ $searchTitle,
 89+ $searchContent);
 90+ }
 91+ $permalink = $body = '';
 92+ $errors = array(false, false, false, false);
 93+ if(!$pager instanceof AdvancedSearchPager)
 94+ {
 95+ $body = Xml::element('div', array('class' => 'errorbox'),
 96+ wfMsg('advancedsearch-permalink-invalid'));
 97+ }
 98+ else
 99+ {
 100+ $errors = $pager->getParseErrors();
 101+ if($errors !== array(false, false, false, false))
 102+ return $this->buildForm($errors, $searchTitle, $searchContent);
 103+
 104+ if($wgRequest->getBool('permalink'))
 105+ $permalink = $this->permaLink($pager);
 106+ if($pager->getNumRows() > 0)
 107+ $body = $pager->getBody();
 108+ else
 109+ $body = Xml::element('div', array('class' => 'errorbox'),
 110+ wfMsg('advancedsearch-empty-result'));
 111+ }
 112+ return $body .
 113+ Xml::element('br', array('clear' => 'both')) .
 114+ $permalink .
 115+ $this->buildForm($errors, $searchTitle, $searchContent);
 116+ }
 117+
 118+ protected function inputRow($name)
 119+ {
 120+ global $wgRequest;
 121+ return Xml::openElement('tr') .
 122+ Xml::openElement('td') .
 123+ Xml::input($name, 50, $wgRequest->getVal($name, false)) .
 124+ Xml::closeElement('td') .
 125+ Xml::closeElement('tr');
 126+ }
 127+
 128+ protected function errorRow($msg)
 129+ {
 130+ return Xml::openElement('tr') .
 131+ Xml::openElement('td') .
 132+ Xml::openElement('span', array('class' => 'error')) .
 133+ $msg .
 134+ Xml::closeElement('span') .
 135+ Xml::closeElement('td') .
 136+ Xml::closeElement('tr');
 137+ }
 138+
 139+ protected function speedCatTable()
 140+ {
 141+ global $wgAdvancedSearchSpeedCats, $wgRequest;
 142+ $i = $j = $n = 0;
 143+ $cols = 3;
 144+ $scarr = $wgRequest->getArray('speedcats', array());
 145+ $retval = Xml::openElement('table');
 146+ foreach(@$wgAdvancedSearchSpeedCats as $name => $display)
 147+ {
 148+ $close = false;
 149+ if($i == 0)
 150+ $retval .= Xml::openElement('tr');
 151+ if($i == $cols - 1)
 152+ {
 153+ $i = 0;
 154+ $j++;
 155+ $close = true;
 156+ }
 157+ else
 158+ $i++;
 159+ $n++;
 160+
 161+ $retval .= Xml::openElement('td');
 162+ $checked = false;
 163+ if(in_array($name, $scarr))
 164+ $checked = true;
 165+ $retval .= Xml::checkLabel($display, 'speedcats[]', "speedcats-$n", $checked, array('value' => $name));
 166+ $retval .= Xml::closeElement('td');
 167+ if($close)
 168+ $retval .= Xml::closeElement('tr');
 169+ }
 170+ if(!$close)
 171+ $retval .= Xml::closeElement('tr');
 172+ $retval .= Xml::openElement('tr');
 173+ $retval .= Xml::openElement('td', array('colspan' => 2));
 174+ $retval .= Xml::element('a', array('href' => 'javascript:caSpeedcats(\'all\');'), wfMsg('advancedsearch-selectall'));
 175+ $retval .= ' / ';
 176+ $retval .= Xml::element('a', array('href' => 'javascript:caSpeedcats(\'none\');'), wfMsg('advancedsearch-selectnone'));
 177+ $retval .= ' / ';
 178+ $retval .= Xml::element('a', array('href' => 'javascript:caSpeedcats(\'invert\');'), wfMsg('advancedsearch-invertselection'));
 179+ $retval .= Xml::closeElement('td');
 180+ $retval .= Xml::closeElement('tr');
 181+ $retval .= $this->speedcatDropdownRows();
 182+ $retval .= Xml::closeElement('table');
 183+ return $retval;
 184+ }
 185+
 186+ protected function speedcatCheckboxes()
 187+ {
 188+ global $wgAdvancedSearchSpeedCats;
 189+ if(!isset($wgAdvancedSearchSpeedCats) || empty($wgAdvancedSearchSpeedCats))
 190+ return array();
 191+ $retval = array();
 192+ $i = 1;
 193+ foreach(@$wgAdvancedSearchSpeedCats as $name => $display)
 194+ {
 195+ $retval[] = "speedcats-{$i}";
 196+ $i++;
 197+ }
 198+ return $retval;
 199+ }
 200+
 201+ protected function speedcatDropdownRows()
 202+ {
 203+ global $wgAdvancedSearchSpeedCatDropdown, $wgRequest;
 204+ if(!isset($wgAdvancedSearchSpeedCatDropdown) || empty($wgAdvancedSearchSpeedCatDropdown))
 205+ return '';
 206+ $sel = $wgRequest->getVal('scdd');
 207+ $retval = Xml::openElement('tr');
 208+ $retval .= Xml::openElement('td', array('colspan' => 3));
 209+ $retval .= wfMsg('advancedsearch-speedcat-dropdown');
 210+ $retval .= Xml::openElement('select', array('name' => 'scdd'));
 211+ $retval .= Xml::option('', '', is_null($sel));
 212+ foreach(@$wgAdvancedSearchSpeedCatDropdown as $key => $value)
 213+ {
 214+ if(is_int($key))
 215+ $key = $value;
 216+ $retval .= Xml::option($key, $value, $sel == $value);
 217+ }
 218+ $retval .= Xml::closeElement('select');
 219+ $retval .= Xml::closeElement('td');
 220+ $retval .= Xml::closeElement('tr');
 221+ return $retval;
 222+ }
 223+
 224+ public static function searchableNamespaces()
 225+ {
 226+ global $wgContLang;
 227+ $retval = array();
 228+ foreach($wgContLang->getFormattedNamespaces() as $ns => $value)
 229+ if($ns >= NS_MAIN)
 230+ $retval[$ns] = $value;
 231+ return $retval;
 232+ }
 233+
 234+ protected function namespaceTable()
 235+ {
 236+ global $wgRequest, $wgUser;
 237+ $i = 0;
 238+ $j = 0;
 239+ $cols = 2;
 240+ $retval = Xml::openElement('table');
 241+ $nsarr = $wgRequest->getArray('namespaces', array());
 242+ foreach(self::searchableNamespaces() as $ns => $display)
 243+ {
 244+ $close = false;
 245+ if($i == 0)
 246+ $retval .= Xml::openElement('tr');
 247+ if($i == $cols - 1)
 248+ {
 249+ $i = 0;
 250+ $j++;
 251+ $close = true;
 252+ }
 253+ else
 254+ $i++;
 255+ $retval .= Xml::openElement('td');
 256+ if($display == '')
 257+ $display = wfMsg('blanknamespace');
 258+ $checked = false;
 259+ if(in_array($ns, $nsarr))
 260+ $checked = true;
 261+ else if(empty($nsarr))
 262+ $checked = $wgUser->getOption("searchNs$ns");
 263+ $retval .= Xml::checkLabel($display, 'namespaces[]', "namespaces-$ns",
 264+ $checked, array('value' => $ns));
 265+ $retval .= Xml::closeElement('td');
 266+ if($close)
 267+ $retval .= Xml::closeElement('tr');
 268+ }
 269+ if(!$close)
 270+ $retval .= Xml::closeElement('tr');
 271+ $retval .= Xml::openElement('tr');
 272+ $retval .= Xml::openElement('td', array('colspan' => 2));
 273+ $retval .= Xml::element('a', array('href' => 'javascript:caNamespaces(\'all\');'), wfMsg('advancedsearch-selectall'));
 274+ $retval .= ' / ';
 275+ $retval .= Xml::element('a', array('href' => 'javascript:caNamespaces(\'none\');'), wfMsg('advancedsearch-selectnone'));
 276+ $retval .= ' / ';
 277+ $retval .= Xml::element('a', array('href' => 'javascript:caNamespaces(\'invert\');'), wfMsg('advancedsearch-invertselection'));
 278+ $retval .= Xml::closeElement('td');
 279+ $retval .= Xml::closeElement('tr');
 280+
 281+ $retval .= Xml::closeElement('table');
 282+ return $retval;
 283+ }
 284+
 285+ protected function namespaceCheckboxes()
 286+ {
 287+ $retval = array();
 288+ foreach(self::searchableNamespaces() as $ns => $unused)
 289+ $retval[] = "namespaces-$ns";
 290+ return $retval;
 291+ }
 292+
 293+ protected function invertJS($func, $checkboxes)
 294+ {
 295+ $retval = "function $func(action)\n{";
 296+ foreach($checkboxes as $c)
 297+ $retval .= "checkboxAction('$c', action);\n";
 298+ $retval .= "}\n";
 299+ return $retval;
 300+ }
 301+
 302+ protected function checkboxActionJS()
 303+ {
 304+ return <<<ENDOFLINE
 305+function checkboxAction(c, action)
 306+{
 307+ var obj = document.getElementById(c);
 308+ switch(action)
 309+ {
 310+ case 'all':
 311+ obj.checked = true;
 312+ break;
 313+ case 'none':
 314+ obj.checked = false;
 315+ break;
 316+ case 'invert':
 317+ obj.checked = !obj.checked;
 318+ }
 319+}
 320+ENDOFLINE;
 321+ }
 322+
 323+ protected function buildForm($parseErrors = null, $searchTitle = true, $searchContent = true)
 324+ {
 325+ global $wgScript, $wgOut;
 326+ $wgOut->addInlineScript(
 327+ $this->checkboxActionJS() .
 328+ $this->invertJS('caNamespaces', $this->namespaceCheckboxes()) .
 329+ $this->invertJS('caSpeedcats', $this->speedcatCheckboxes()));
 330+ $retval = wfMsgExt('advancedsearch-toptext', array('parse'));
 331+ $retval .= Xml::openElement('form', array('method' => 'GET', 'action' => $wgScript));
 332+ $retval .= Xml::hidden('title', $this->getTitle()->getPrefixedDbKey());
 333+ $retval .= Xml::hidden('do', 'search');
 334+
 335+ // The big table everything is in
 336+ $retval .= Xml::openElement('table');
 337+
 338+ // The fieldset+table for searching page content
 339+ $retval .= Xml::openElement('tr');
 340+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 341+ $retval .= Xml::openElement('fieldset', array('class' => 'nested'));
 342+ $retval .= Xml::element('legend', array('class' => 'advancedsearchLegend'), wfMsg('advancedsearch-contentsearch'));
 343+ $retval .= Xml::openElement('table');
 344+
 345+ // title/content checkboxes
 346+ $retval .= Xml::openElement('tr');
 347+ $retval .= Xml::openElement('td');
 348+ $retval .= wfMsg('advancedsearch-searchin');
 349+ $retval .= Xml::checkLabel(wfMsg('advancedsearch-searchin-title'), 'searchtitle',
 350+ 'searchtitle', $searchTitle);
 351+ $retval .= Xml::checkLabel(wfMsg('advancedsearch-searchin-content'), 'searchcontent',
 352+ 'searchcontent', $searchContent);
 353+ $retval .= Xml::closeElement('td');
 354+ $retval .= Xml::closeElement('tr');
 355+
 356+ // Include fieldset
 357+ $retval .= Xml::openElement('tr');
 358+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 359+ $retval .= Xml::openElement('fieldset', array('class' => 'nested'));
 360+ $retval .= Xml::element('legend', array('class' => 'advancedsearchLegend'), wfMsg('advancedsearch-content-include'));
 361+ $retval .= Xml::openElement('table');
 362+ $retval .= $this->inputRow('content-incl');
 363+ if(is_array($parseErrors) && $parseErrors[0] !== false)
 364+ $retval .= $this->errorRow($parseErrors[0]);
 365+ $retval .= Xml::closeElement('table');
 366+ $retval .= Xml::closeElement('fieldset');
 367+ $retval .= Xml::closeElement('td');
 368+ $retval .= Xml::closeElement('tr');
 369+
 370+ // Exclude fieldset
 371+ $retval .= Xml::openElement('tr');
 372+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 373+ $retval .= Xml::openElement('fieldset', array('class' => 'nested'));
 374+ $retval .= Xml::element('legend', array('class' => 'advancedsearchLegend'), wfMsg('advancedsearch-content-exclude'));
 375+ $retval .= Xml::openElement('table');
 376+ $retval .= $this->inputRow('content-excl');
 377+ if(is_array($parseErrors) && $parseErrors[1] !== false)
 378+ $retval .= $this->errorRow($parseErrors[1]);
 379+ $retval .= Xml::closeElement('table');
 380+ $retval .= Xml::closeElement('fieldset');
 381+ $retval .= Xml::closeElement('td');
 382+ $retval .= Xml::closeElement('tr');
 383+
 384+ $retval .= Xml::closeElement('table');
 385+ $retval .= Xml::closeElement('fieldset');
 386+ $retval .= Xml::closeElement('td');
 387+
 388+ // The namespace fieldset
 389+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 390+ $retval .= Xml::openElement('fieldset', array('class' => 'nested'));
 391+ $retval .= Xml::element('legend', array('class' => 'advancedsearchLegend'), wfMsg('advancedsearch-namespaces'));
 392+ $retval .= $this->namespaceTable();
 393+ $retval .= Xml::closeElement('fieldset');
 394+ $retval .= Xml::closeElement('td');
 395+ $retval .= Xml::closeElement('tr');
 396+
 397+ // The category fieldset
 398+ $retval .= Xml::openElement('tr');
 399+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 400+ $retval .= Xml::openElement('fieldset', array('class' => 'nested'));
 401+ $retval .= Xml::element('legend', array('class' => 'advancedsearchLegend'), wfMsg('advancedsearch-categorysearch'));
 402+ $retval .= Xml::openElement('table');
 403+
 404+ // The include fieldset
 405+ $retval .= Xml::openElement('tr');
 406+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 407+ $retval .= Xml::openElement('fieldset', array('class' => 'nested'));
 408+ $retval .= Xml::element('legend', array('class' => 'advancedsearchLegend'), wfMsg('advancedsearch-category-include'));
 409+ $retval .= Xml::openElement('table');
 410+ $retval .= $this->inputRow('cat-incl');
 411+ if(is_array($parseErrors) && $parseErrors[2] !== false)
 412+ $retval .= $this->errorRow($parseErrors[2]);
 413+ $retval .= Xml::closeElement('table');
 414+ $retval .= Xml::closeElement('fieldset');
 415+ $retval .= Xml::closeElement('td');
 416+ $retval .= Xml::closeElement('tr');
 417+
 418+ // The exclude fieldset
 419+ $retval .= Xml::openElement('tr');
 420+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 421+ $retval .= Xml::openElement('fieldset', array('class' => 'nested'));
 422+ $retval .= Xml::element('legend', array('class' => 'advancedsearchLegend'), wfMsg('advancedsearch-category-exclude'));
 423+ $retval .= Xml::openElement('table');
 424+ $retval .= $this->inputRow('cat-excl');
 425+ if(is_array($parseErrors) && $parseErrors[3] !== false)
 426+ $retval .= $this->errorRow($parseErrors[3]);
 427+ $retval .= Xml::closeElement('table');
 428+ $retval .= Xml::closeElement('fieldset');
 429+ $retval .= Xml::closeElement('td');
 430+ $retval .= Xml::closeElement('tr');
 431+
 432+ $retval .= Xml::closeElement('table');
 433+ $retval .= Xml::closeElement('fieldset');
 434+ $retval .= Xml::closeElement('td');
 435+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 436+ $retval .= Xml::openElement('table');
 437+
 438+ // The speedcat fieldset
 439+ global $wgAdvancedSearchSpeedCats;
 440+ if(!empty($wgAdvancedSearchSpeedCats))
 441+ {
 442+ $retval .= Xml::openElement('td', array('valign' => 'top'));
 443+ $retval .= Xml::openElement('fieldset', array('class' => 'nested'));
 444+ $retval .= Xml::element('legend', array('class' => 'advancedsearchLegend'), wfMsg('advancedsearch-speedcats'));
 445+ $retval .= $this->speedCatTable();
 446+ $retval .= Xml::closeElement('fieldset');
 447+ $retval .= Xml::closeElement('td');
 448+ }
 449+
 450+ $retval .= Xml::closeElement('table');
 451+ $retval .= Xml::closeElement('td');
 452+ $retval .= Xml::closeElement('tr');
 453+ $retval .= Xml::closeElement('table');
 454+ $retval .= Xml::checkLabel(wfMsg('advancedsearch-permalink-check'), 'permalink', 'permalink');
 455+ $retval .= Xml::element('br');
 456+ $retval .= Xml::submitButton(wfMsg('advancedsearch-submit'));
 457+ $retval .= Xml::closeElement('form');
 458+ return $retval;
 459+ }
 460+}
Index: trunk/extensions/AdvancedSearch/README
@@ -0,0 +1,80 @@
 2+NOTE: This extension is UNDER DEVELOPMENT and NOT READY FOR PRODUCTION USE
 3+
 4+
 5+ADVANCEDSEARCH EXTENSION README
 6+
 7+TABLE OF CONTENTS
 8+1. Introduction
 9+2. Where to get AdvancedSearch
 10+3. Installation
 11+3A. Building the index
 12+4. Using Special:AdvancedSearch
 13+5. Customizing Special:AdvancedSearch
 14+5A. Common categories (checkboxes)
 15+5B. Common categories (dropdown)
 16+5C. Changing interface messages
 17+6. Licensing
 18+7. Translating AdvancedSearch
 19+8. Contact
 20+9. Credits
 21+
 22+1. INTRODUCTION
 23+This extension adds a special page which allows for searching in a more advanced way than Special:Search does. You can use complex AND/OR expressions, search for pages that don't match an expression, and search for pages that are or aren't in certain categories.
 24+
 25+2. WHERE TO GET ADVANCEDSEARCH
 26+You can download a tarball at http://www.mediawiki.org/wiki/Special:ExtensionDistributor/AdvancedSearch . If that link doesn't work yet, go to http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/AdvancedSearch and download all files in that directory. To download a file, simply click on it, then click the (download) link on top.
 27+
 28+3. INSTALLATION
 29+If you downloaded the tarball, extract it in the /path/to/your/wiki/extensions directory. If you downloaded the individual files, create the /path/to/your/wiki/extensions/AdvancedSearch directory and put the files you downloaded in there.
 30+
 31+Open LocalSettings.php and add the following line at the end:
 32+
 33+require_once("$IP/extensions/AdvancedSearch/AdvancedSearch.setup.php");
 34+
 35+Finally, execute the following commands on the command line:
 36+
 37+cd /path/to/your/wiki/maintenance
 38+php update.php
 39+
 40+3A. BUILDING THE INDEX
 41+For the category search feature to work properly, the category index must be built. The index will gradually build itself as pages are edited, but if you want the category search to work right from the start, you have to build the index manually. You only need to do this once, as the index will keep itself up to date. To build the index, execute the following commands on the command line:
 42+
 43+cd /path/to/your/wiki/extensions/AdvancedSearch
 44+php populateCategorySearch.php
 45+
 46+Building the index may take a long time on large wikis, and the wiki may slow down significantly while the index is being built. If this is the case, abort the script by pressing Ctrl+C and run "php populateCategorySearch.php --help" (without the quotes) to read about throttling options that can help you to build the index at a slower pace and keep the wiki usable.
 47+
 48+4. USING SPECIAL:ADVANCEDSEARCH
 49+To use AdvancedSearch, go to the Special:AdvancedSearch special page. In the "Search in page content" text boxes, you can enter a list of words that should (or shouldn't) occur in the page content and/or title. The search is case-insensitive and you can use * as a wildcard, e.g. 'check*' will match both 'checkbox' and 'checkmate'. You can combine words with the AND and OR operators: searching for 'foo AND bar' will only list pages that include both words, whereas 'foo OR bar' will list pages that include either word. 'foo bar' is synonymous with 'foo AND bar'; if you want to search for 'foo bar' as a phrase, you have to quote it, i.e. '"foo bar"'. You can make more complex expressions by using parentheses, as in 'foo AND ( bar OR baz )'. Note that ( and ) should be separate words, so something like 'foo AND (bar OR baz)' won't do what you expect. If you use multiple operators without using parentheses, AND will take precedence over OR, so 'foo AND bar OR baz' is equivalent to '( foo AND bar ) OR baz'. Words shorter than four letters and some common words (so-called stop words) can't be searched for; if you try, an error message will be shown. A list of stop words can be found at http://dev.mysql.com/doc/refman/5.0/en/fulltext-stopwords.html .
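The grouping rules above can be illustrated with a small PHP sketch. It is not the extension's parser (that code is not shown in this part of the diff); the function names groupQuery, parseOr, parseAnd and parsePrimary are hypothetical, and quoted phrases and wildcards are deliberately left out. It splits the query on whitespace, treats a missing operator as AND, lets AND bind tighter than OR, and returns the query with every group spelled out.

```php
<?php
/**
 * Hypothetical illustration, not part of AdvancedSearch: make the grouping
 * of a query explicit. Tokens are split on whitespace, which is why
 * parentheses have to be separate words. Quoted phrases are not handled.
 */
function groupQuery($query)
{
    $tokens = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    $pos = 0;
    $result = parseOr($tokens, $pos);
    if ($pos < count($tokens)) {
        throw new Exception("Unexpected token '{$tokens[$pos]}'");
    }
    return $result;
}

// or-expression: and-expression ( OR and-expression )*
function parseOr(array $tokens, &$pos)
{
    $terms = array(parseAnd($tokens, $pos));
    while ($pos < count($tokens) && $tokens[$pos] === 'OR') {
        $pos++;
        $terms[] = parseAnd($tokens, $pos);
    }
    return count($terms) > 1 ? '( ' . implode(' OR ', $terms) . ' )' : $terms[0];
}

// and-expression: primary ( [AND] primary )*, a missing operator means AND
function parseAnd(array $tokens, &$pos)
{
    $terms = array(parsePrimary($tokens, $pos));
    while ($pos < count($tokens) && $tokens[$pos] !== 'OR' && $tokens[$pos] !== ')') {
        if ($tokens[$pos] === 'AND') {
            $pos++;
        }
        $terms[] = parsePrimary($tokens, $pos);
    }
    return count($terms) > 1 ? '( ' . implode(' AND ', $terms) . ' )' : $terms[0];
}

// primary: word | "(" or-expression ")"
function parsePrimary(array $tokens, &$pos)
{
    if ($pos >= count($tokens)) {
        throw new Exception('Unexpected end of query');
    }
    $token = $tokens[$pos++];
    if ($token === '(') {
        $inner = parseOr($tokens, $pos);
        if ($pos >= count($tokens) || $tokens[$pos] !== ')') {
            throw new Exception("Missing ')'");
        }
        $pos++;
        return $inner;
    }
    if ($token === ')' || $token === 'AND' || $token === 'OR') {
        throw new Exception("Unexpected token '$token'");
    }
    return $token;
}

echo groupQuery('foo AND bar OR baz'), "\n";
echo groupQuery('foo bar'), "\n";
```

In this sketch '(bar' is an ordinary word, because only a standalone ( opens a group. That is one way to see why 'foo AND (bar OR baz)' does not behave like 'foo AND ( bar OR baz )'.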
 50+
 51+The "Search in categories" text boxes work the same way as their title/content counterparts, except that 'foo bar' isn't interpreted as 'foo AND bar', so you don't need to quote category names with spaces in them.
 52+
 53+In the "Namespaces" section, you can specify the namespaces to search in. By default, only the main namespace is searched; you can change the default in your preferences.
 54+
 55+If you've configured it (see sections 5A and 5B), the "Common categories" section will appear, containing the checkboxes and dropdown you configured. The checkboxes you check and the category you choose in the dropdown are added to the "List articles in these categories" field: if you enter "foo" in the text box, check the "bar" and "baz" checkboxes and select "foobar" in the dropdown, the complete search query will be 'foo AND ( bar OR baz ) AND foobar'.
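As a rough sketch of how the three inputs end up in one query, assuming the behaviour described above: the checked boxes are joined with OR, and the text box, the checkbox group and the dropdown choice are joined with AND. The functions combineCategoryQuery and groupIfNeeded are hypothetical names used for illustration only; the extension's own code for this step is not shown in this part of the diff.

```php
<?php
/**
 * Hypothetical helper, not part of AdvancedSearch: wrap an expression in
 * parentheses if it contains OR, so that joining it with AND keeps its meaning.
 */
function groupIfNeeded($expr)
{
    return preg_match('/\bOR\b/', $expr) ? "( $expr )" : $expr;
}

/**
 * Hypothetical sketch: combine the category text box, the checked common
 * category checkboxes and the dropdown choice into one expression.
 */
function combineCategoryQuery($text, array $checked, $dropdown)
{
    $parts = array();
    if ($text !== '') {
        $parts[] = groupIfNeeded($text);
    }
    if (count($checked) > 0) {
        // Checked checkboxes are alternatives, so they are joined with OR
        $parts[] = groupIfNeeded(implode(' OR ', $checked));
    }
    if ($dropdown !== '') {
        // A dropdown value may itself be an expression such as '2000 OR 2001'
        $parts[] = groupIfNeeded($dropdown);
    }
    return implode(' AND ', $parts);
}

echo combineCategoryQuery('foo', array('bar', 'baz'), 'foobar'), "\n";
```

For the inputs from the example above, the sketch prints 'foo AND ( bar OR baz ) AND foobar' (without the quotes).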
 56+
 57+If you check the permanent link checkbox, a permanent link of the form Special:AdvancedSearch/123 will be generated. Clicking this link will fill out the form exactly like you did and run the search again. This is useful when referring to searches, e.g. on talk pages.
 58+
 59+5. CUSTOMIZING SPECIAL:ADVANCEDSEARCH
 60+You can customize the search form by adding the common categories box, or by changing the text that appears on the form.
 61+
 62+5A. COMMON CATEGORIES (CHECKBOXES)
 63+You can add the common categories checkboxes by adding the following line to LocalSettings.php:
 64+
 65+$wgAdvancedSearchSpeedCats = array('Living people' => 'Alive', 'Deceased people' => 'Dead');
 66+
 67+This will add two checkboxes, one labeled "Alive" corresponding to [[Category:Living people]] and one labeled "Dead" corresponding to [[Category:Deceased people]]. Checking only the "Alive" checkbox will list only pages in [[Category:Living people]]; checking both checkboxes will list pages in either category.
 68+
 69+5B. COMMON CATEGORIES (DROPDOWN)
 70+Additionally, you can add a dropdown to the common categories box. This will allow you to select one category, and will only list pages in that category. To add a dropdown, add the following to LocalSettings.php:
 71+
 72+$wgAdvancedSearchSpeedCatDropdown = array('2000', '2001', '2002', '2000s' => '2000 OR 2001 OR 2002');
 73+
 74+This will create a dropdown with the choices 2000, 2001, 2002 and 2000s (in that order). Selecting '2000s' will list pages in any of the 2000, 2001 and 2002 categories.
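How a mixed array like this is read can be sketched as follows. The sketch mirrors the is_int() check in speedcatDropdownRows() in AdvancedSearch.body.php; the function name normalizeDropdown is hypothetical and does not exist in the extension.

```php
<?php
/**
 * Hypothetical helper, not part of AdvancedSearch: turn the dropdown
 * configuration into label => query pairs. Entries listed without an
 * explicit key use their value as both label and query.
 */
function normalizeDropdown(array $config)
{
    $choices = array();
    foreach ($config as $key => $value) {
        // PHP assigns integer keys (0, 1, 2, ...) to entries without a key
        if (is_int($key)) {
            $key = $value;
        }
        $choices[$key] = $value;
    }
    return $choices;
}

$choices = normalizeDropdown(array('2000', '2001', '2002', '2000s' => '2000 OR 2001 OR 2002'));
foreach ($choices as $label => $query) {
    echo "$label => $query\n";
}
```

One consequence of the is_int() check is that a label consisting only of digits cannot be mapped to a different query: PHP stores a key such as '2000' as an integer, so the entry is treated as if it had no key.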
 75+
 76+5C. CHANGING INTERFACE MESSAGES
 77+Interface messages are the texts displayed in the search form. You can change them by editing the corresponding MediaWiki: pages (for example, to edit the advancedsearch-toptext message, edit [[MediaWiki:Advancedsearch-toptext]]). Note that once you customize a message, users with a different interface language will no longer see the translation of the original message.
 78+
 79+The messages most suitable for customization are listed below. For a full list of messages used by AdvancedSearch, see extensions/AdvancedSearch/AdvancedSearch.i18n.php .
 80+
 81+MESSAGE FUNCTION
