r102136 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r102136
Date:01:09, 6 November 2011
Author:danwe
Status:deferred (Comments)
Tags:
Comment:
Version 1.0.1, configuration variables and settings file added, some cleanup done.
Modified paths:
  • /trunk/extensions/RegexFun/RELEASE-NOTES (modified)
  • /trunk/extensions/RegexFun/RegexFun.i18n.php (modified)
  • /trunk/extensions/RegexFun/RegexFun.php (modified)
  • /trunk/extensions/RegexFun/RegexFun_Settings.php (added)

Diff

Index: trunk/extensions/RegexFun/RELEASE-NOTES
@@ -1,10 +1,14 @@
22 Changelog:
33 ==========
4 - * (trunk) 2011 -- version 1.0.1
5 - - Bug in '#regex_var' solved: default value now gets returned in case '#regex' went wront or
 4+ * November 6, 2011 -- Version 1.0.1
 5+ - Bug in '#regex_var' solved: default value now gets returned in case '#regex' went wrong or
66 not called before.
7 - - '#regexall' last parameter, 'length' can be empty '' which is the specified default now. It
8 - simply means there is not limit and all items should be returned.
 7+ - '#regexall' last parameter, 'length', can be empty '' which is the specified default now. It
 8+ simply means there is no limit and all items should be returned ('-1' has another meaning).
 9+ - Introduces two global configuration variables:
 10+ + '$egRegexFunDisabledFunctions' - to disable certain functions within the wiki.
 11+ + '$egRegexFunMaxRegexPerParse' - limit for number of function calls per parser process.
 12+ - Some minor cleanup done.
913
1014 * November 4, 2011 -- Version 1.0 (initial public release).
1115 Introduces the following parser functions defined within 'ExtRegexFun' class:
@@ -18,22 +22,22 @@
1923 - Replacing within strings, using regular expression.
2024 - Allows save use of user input within expressions by running '#regexquote' parser function
2125 over it. An important function other regex extensions still lack.
22 - - Allows to get the last '#regex' subexpression matches via '#regex_var', even allows to
 26+ - Allows to get the last '#regex' sub-expression matches via '#regex_var', even allows to
2327 get them in an extensive way, e.g. "$0 has $2, $1 and $3".
2428 - Invalid regex will result in an inline error message instead of php notice as some other
2529 regex extensions might do it.
26 - - Efficient regex validation allowing all kinds of delimitiers and flags but filtering 'e'
 30+ - Efficient regex validation allowing all kinds of delimiters and flags but filtering 'e'
2731 flag for security reasons in any case...
2832 - ... therefore, original 'e' flag instead has another but very similar meaning adjusted for
29 - a mediawiki context. Instead of executing php code within the replacement string, the 'e'
 33+ a MediaWiki context. Instead of executing php code within the replacement string, the 'e'
3034 flag now causes the replacement string to be parsed after references ('$1', '\1') are
3135 replaced. This allows stuff like "{{((}}Template{{!}}$1{{))}}" within the replacement.
3236
3337 Changes since earlier versions (trunk and earlier, non-public):
3438 - '#regexsearch' parser function removed. Instead there is a special flag 'r' now which leads
3539 to the same result if #regex and replacement is being used: '' as output if nothing replaced.
36 - - '#regexascii' parser function removed. Instead '#regexquote' will make an ascii-quote MW
37 - special characters ';' and '#' if they are first character in the string.
 40+ - '#regexascii' parser function removed. Instead '#regexquote' will ascii-quote MW special
 41+ characters ';' and '#' if they are first character in the string.
3842 - '#regexquote' delimiter set to '/' by default.
3943 - '#regex' no longer returns its value as parsed wikitext (option 'noparse' => false) instead
4044 the 'e' flag can be used (although not exactly the same).
Index: trunk/extensions/RegexFun/RegexFun.i18n.php
@@ -18,6 +18,7 @@
1919 $messages['en'] = array(
2020 'regexfun-desc' => 'Adds parser functions allowing the use of regular expressions within wiki pages',
2121 'regexfun-invalid' => 'The regular expression "$1" is invalid.',
 22+ 'regexfun-limit-exceed' => 'Maximum of $1 "Regex Fun" regular expression handlings reached.',
2223 );
2324
2425 /** German (Deutsch)
@@ -26,6 +27,7 @@
2728 $messages['de'] = array(
2829 'regexfun-desc' => 'Fügt Parser-Funktionen hinzu, um reguläre Ausdrücke auf Wiki-Seiten verwenden zu können',
2930 'regexfun-invalid' => '„$1“ ist kein gültiger regulärer Ausdruck.',
 31+ 'regexfun-limit-exceed' => 'Maximale Anzahl von $1 durch „Regex Fun“ behandelte reguläre Ausdrücke erreicht.',
3032 );
3133
3234 /** Galician (Galego)
Index: trunk/extensions/RegexFun/RegexFun.php
@@ -20,6 +20,7 @@
2121
2222 if ( ! defined( 'MEDIAWIKI' ) ) { die( ); }
2323
 24+
2425 /**** extension info ****/
2526
2627 $wgExtensionCredits['parserhook'][] = array(
@@ -36,7 +37,10 @@
3738
3839 $wgHooks['ParserFirstCallInit'][] = 'ExtRegexFun::init';
3940 $wgHooks['ParserClearState' ][] = 'ExtRegexFun::onParserClearState';
 41+$wgHooks['ParserLimitReport' ][] = 'ExtRegexFun::onParserLimitReport';
4042
 43+// Include the settings file:
 44+require_once ExtRegexFun::getDir() . '/RegexFun_Settings.php';
4145
4246 /**
4347 * Extension class with all the regex functions functionality
@@ -54,18 +58,29 @@
5559 */
5660 const VERSION = '1.0.1';
5761
58 - /**
59 - * Sets up parser functions
60 - */
61 - public static function init( &$parser ) {
62 - $parser->setFunctionHook( 'regex', array( __CLASS__, 'regex' ) );
63 - $parser->setFunctionHook( 'regex_var', array( __CLASS__, 'regex_var' ) );
64 - $parser->setFunctionHook( 'regexall', array( __CLASS__, 'regexall' ) );
65 - $parser->setFunctionHook( 'regexquote', array( __CLASS__, 'regexquote' ) );
66 - $parser->setFunctionHook( 'regexascii', array( __CLASS__, 'regexascii' ) );
 62+ /**
 63+ * Sets up parser functions
 64+ */
 65+ public static function init( Parser &$parser ) {
 66+ self::initFunction( $parser, 'regex' );
 67+ self::initFunction( $parser, 'regex_var' );
 68+ self::initFunction( $parser, 'regexquote' );
 69+ self::initFunction( $parser, 'regexall' );
6770
6871 return true;
69 - }
 72+ }
 73+ private static function initFunction( Parser &$parser, $name, $functionCallback = null ) {
 74+ if( $functionCallback === null ) {
 75+ $functionCallback = array( __CLASS__, $name );
 76+ }
 77+
 78+ global $egRegexFunDisabledFunctions;
 79+
 80+ // only register function if not disabled by configuration
 81+ if( ! in_array( $name, $egRegexFunDisabledFunctions ) ) {
 82+ $parser->setFunctionHook( $name, $functionCallback );
 83+ }
 84+ }
7085
7186 /**
7287 * Returns the extensions base installation directory.
@@ -94,6 +109,7 @@
95110 */
96111 private static $tmpRegexCB;
97112
 113+
98114 /**
99115 * Checks whether the given regular expression is valid or would cause an error.
100116 * Also alters the pattern in case it would be a security risk and communicates
@@ -179,6 +195,15 @@
180196 return false;
181197 }
182198
 199+ private static function limitHandler( Parser &$parser ) {
 200+ // is the limit exceeded for this parsers parse() process?
 201+ if( self::limitExceeded( $parser ) ) {
 202+ return false;
 203+ }
 204+ self::increaseRegexCount( $parser );
 205+ return true;
 206+ }
 207+
183208 /**
184209 * Returns a valid parser function output that the given pattern is not valid for a regular
185210 * expression. The message can be displayed in the wiki and is wrapped in an error-class span
@@ -188,11 +213,17 @@
189214 *
190215 * @return Array
191216 */
192 - public static function invalidRegexParsingOutput( $pattern ) {
193 - $msg = '<span class="error">' . wfMsgExt( 'regexfun-invalid', array( 'content' ), "<tt><nowiki>$pattern</nowiki></tt>" ). '</span>';
194 - return array( $msg, 'noparse' => true, 'isHTML' => false ); // isHTML must be false for #iferror!
 217+ protected static function msgInvalidRegex( $pattern ) {
 218+ $msg = '<span class="error">' . wfMsgForContent( 'regexfun-invalid', "<tt><nowiki>$pattern</nowiki></tt>" ). '</span>';
 219+ return array( $msg, 'noparse' => false, 'isHTML' => false ); // 'noparse' true for <nowiki>, 'isHTML' false for #iferror!
195220 }
196221
 222+ protected static function msgLimitExceeded() {
 223+ global $egRegexFunMaxRegexPerParse;
 224+ $msg = '<span class="error">' . wfMsgForContent( 'regexfun-limit-exceed', $egRegexFunMaxRegexPerParse ). '</span>';
 225+ return array( $msg, 'noparse' => true, 'isHTML' => false ); // 'isHTML' must be false for #iferror!
 226+ }
 227+
197228 /**
198229 * Helper function. Validates regex and takes care of security risks in pattern which is why
199230 * the pattern is taken by reference!
@@ -212,21 +243,26 @@
213244 }
214245
215246 /**
216 - * Performs a regular expression search or replacement
 247+ * Performs a regular expression search or replacement
217248 *
218 - * @param $parser Parser instance of running Parse
219 - * @param $subject String input string to evaluate
220 - * @param $pattern String regular expression pattern - must use /, | or % delimiter
221 - * @param $replace String regular expression replacement
 249+ * @param $parser Parser instance of running Parse
 250+ * @param $subject String input string to evaluate
 251+ * @param $pattern String regular expression pattern - must use /, | or % delimiter
 252+ * @param $replace String regular expression replacement
222253 *
223 - * @return String Result of replacing pattern with replacement in string, or matching text if replacement was omitted
224 - */
 254+ * @return String Result of replacing pattern with replacement in string, or matching text if replacement was omitted
 255+ */
225256 public static function regex( Parser &$parser, $subject = '', $pattern = '', $replace = null, $limit = -1 ) {
 257+ // check whether limit exceeded:
 258+ if( self::limitExceeded( $parser ) ) {
 259+ return self::msgLimitExceeded();
 260+ }
 261+ self::increaseRegexCount( $parser );
226262
227263 // validate, initialise and check for wrong input:
228264 $continue = self::validateRegexCall( $parser, $subject, $pattern, $specialFlags, true );
229265 if( ! $continue ) {
230 - return self::invalidRegexParsingOutput( $pattern );;
 266+ return self::msgInvalidRegex( $pattern );
231267 }
232268
233269 if( $replace === null ) {
@@ -302,11 +338,17 @@
303339 *
304340 * @return String result of all matching text parts separated by a string
305341 */
306 - public static function regexall( &$parser , $subject = '' , $pattern = '' , $separator = ', ' , $offset = 0 , $length = '' ) {
 342+ public static function regexall( &$parser , $subject = '' , $pattern = '' , $separator = ', ' , $offset = 0 , $length = '' ) {
 343+ // check whether limit exceeded:
 344+ if( self::limitExceeded( $parser ) ) {
 345+ return self::msgLimitExceeded();
 346+ }
 347+ self::increaseRegexCount( $parser );
 348+
307349 // validate and check for wrong input:
308350 $continue = self::validateRegexCall( $parser, $subject, $pattern, $specialFlags, false );
309351 if( ! $continue ) {
310 - return self::invalidRegexParsingOutput( $pattern );;
 352+ return self::msgInvalidRegex( $pattern );;
311353 }
312354
313355 // adjust default values:
@@ -334,13 +376,13 @@
335377 return '';
336378 }
337379
338 - /**
339 - * Returns a value from the last performed regex match
 380+ /**
 381+ * Returns a value from the last performed regex match
340382 *
341 - * @index $parser Parser instance of running Parser
342 - * @param $index Integer index of the last match which should be returnd or a string containing $n as indexes to be replaced
343 - * @param $defaultVal Integer default value which will be returned when the result with the given index doesn't exist or is a void string
344 - */
 383+ * @index $parser Parser instance of running Parser
 384+ * @param $index Integer index of the last match which should be returnd or a string containing $n as indexes to be replaced
 385+ * @param $defaultVal Integer default value which will be returned when the result with the given index doesn't exist or is a void string
 386+ */
345387 public static function regex_var( &$parser, $index = 0, $defaultVal = '' ) {
346388 // get matches from last #regex
347389 $lastMatches = self::getLastMatches( $parser );
@@ -360,6 +402,13 @@
361403 }
362404 } else {
363405 // complex string is given, something like "$1, $2 and $3":
 406+
 407+ // limit check, only in complex mode:
 408+ if( self::limitExceeded( $parser ) ) {
 409+ return self::msgLimitExceeded();
 410+ }
 411+ self::increaseRegexCount( $parser );
 412+
364413 /*
365414 * replace all back-references with their number increased by 1!
366415 * this way we can also handle $0 in the right way!
@@ -396,16 +445,16 @@
397446 }
398447 return preg_replace( '%\d+%', (int)$index + 1, $full );
399448 }
400 -
401 - /**
402 - * takes $str and puts a backslash in front of each character that is part of the regular expression syntax
 449+
 450+ /**
 451+ * takes $str and puts a backslash in front of each character that is part of the regular expression syntax
403452 *
404 - * @param $parser Parser instance of running Parser
405 - * @param $str String input string to change
406 - * @param $delimiter String delimiter which also will be escaped within $str (default is set to '/')
 453+ * @param $parser Parser instance of running Parser
 454+ * @param $str String input string to change
 455+ * @param $delimiter String delimiter which also will be escaped within $str (default is set to '/')
407456 *
408 - * @return String Returns the quoted string
409 - */
 457+ * @return String Returns the quoted string
 458+ */
410459 public static function regexquote( &$parser, $str = null, $delimiter = '/' ) {
411460 if( $str === null ) {
412461 return '';
@@ -432,11 +481,26 @@
433482 return $str;
434483 }
435484
 485+ public static function onParserLimitReport( $parser, &$report ) {
 486+ global $egRegexFunMaxRegexPerParse;
 487+ $count = self::getLimitCount( $parser );
 488+
 489+ $report .= 'ExtRegexFun count: ';
 490+
 491+ if( $egRegexFunMaxRegexPerParse !== -1 ) {
 492+ $report .= "{$count}/{$egRegexFunMaxRegexPerParse}\n";
 493+ }
 494+ else {
 495+ $report .= "{$count}\n";
 496+ }
 497+ return true;
 498+ }
436499
437 - /*********************************
438 - **** HELPER - For Store of ****
439 - **** regex_var within Parser ****
440 - *********************************
 500+
 501+ /***********************************
 502+ **** HELPER - For store of ****
 503+ **** regex stuff within Parser ****
 504+ ***********************************
441505 ****
442506 **
443507 * Adding the info to each Parser object makes it invulnerable to new Parser objects being created
@@ -454,14 +518,41 @@
455519
456520 public static function onParserClearState( &$parser ) {
457521 //cleanup to avoid conflicts with job queue or Special:Import
 522+ $parser->mExtRegexFun = array();
458523 self::setLastMatches( $parser, null );
459524 self::setLastPattern( $parser, '' );
460 - self::setLastSubject( $parser, '' );
 525+ self::setLastSubject( $parser, '' );
 526+ $parser->mExtRegexFun['counter'] = 0;
461527
462528 return true;
463529 }
464530
465531 /**
 532+ * Returns whether the maximum limit of regular expression has been exceeded
 533+ * for the given parser objects current Parser::parse() process.
 534+ *
 535+ * @return boolean
 536+ */
 537+ public static function limitExceeded( Parser &$parser ) {
 538+ global $egRegexFunMaxRegexPerParse;
 539+ return (
 540+ $egRegexFunMaxRegexPerParse !== -1
 541+ && $parser->mExtRegexFun['counter'] >= $egRegexFunMaxRegexPerParse
 542+ );
 543+ }
 544+
 545+ public static function getLimitCount( Parser &$parser ) {
 546+ if( isset( $parser->mExtRegexFun['counter'] ) ) {
 547+ return $parser->mExtRegexFun['counter'];
 548+ }
 549+ return 0;
 550+ }
 551+
 552+ private static function increaseRegexCount( Parser &$parser ) {
 553+ $parser->mExtRegexFun['counter']++;
 554+ }
 555+
 556+ /**
466557 * Returns the last regex matches done by #regex in the context of the same parser object.
467558 *
468559 * @param Parser $parser
Index: trunk/extensions/RegexFun/RegexFun_Settings.php
@@ -0,0 +1,44 @@
 2+<?php
 3+
 4+/**
 5+ * File defining the settings for the 'Regex Fun' extension.
 6+ * More info can be found at http://www.mediawiki.org/wiki/Extension:Regex_Fun#Configuration
 7+ *
 8+ * NOTICE:
 9+ * =======
 10+ * Changing one of these settings can be done by copying and placing
 11+ * it in LocalSettings.php, AFTER the inclusion of 'Regex Fun'.
 12+ *
 13+ * @file RegexFun_Settings.php
 14+ * @ingroup RegexFun
 15+ * @since 1.0.1
 16+ *
 17+ * @author Daniel Werner
 18+ */
 19+
 20+/**
 21+ * Allows to define functions which should not be available within the wiki.
 22+ *
 23+ * @example
 24+ * # disable '#regexall' and '#regex_var' functions:
 25+ * $egRegexFunDisabledFunctions = array( 'regexall', 'regex_var' );
 26+ *
 27+ * @since 1.0.1
 28+ * @var array
 29+ */
 30+$egRegexFunDisabledFunctions = array();
 31+
 32+
 33+/**
 34+ * Defines the maximum regular expression executions per parser process. This
 35+ * counts all executed regular expression usages by this extension. The counter
 36+ * will be increased by '#regex', '#regexall' and '#regex_var' if a reference
 37+ * string is given but not if only a index is requested. '#regexquote' is not
 38+ * affected. When the limit is exceeded, a '#iferror' catchable error message
 39+ * will be put out instead of the result of the function.
 40+ * The limit can be set to -1 to disable the limit (default).
 41+ *
 42+ * @since 1.0.1
 43+ * @var integer
 44+ */
 45+$egRegexFunMaxRegexPerParse = -1;
Property changes on: trunk/extensions/RegexFun/RegexFun_Settings.php
___________________________________________________________________
Added: svn:eol-style
146 + native
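
Note on the limit mechanism: the diff above resets a per-parser counter in onParserClearState(), checks it in limitExceeded(), and increments it in increaseRegexCount(), with -1 meaning "no limit". The following is a minimal standalone sketch of that logic, not the extension's code. The class name RegexCallLimiter and its method names are hypothetical, and it keeps the counter in an object rather than on the Parser instance.

```php
<?php
// Standalone sketch (hypothetical names), mirroring the counter logic in the diff:
// a per-parse call counter where a maximum of -1 disables the limit.

final class RegexCallLimiter {
	/** @var int Maximum counted calls per parse; -1 disables the limit. */
	private $max;

	/** @var int Calls counted since the last reset. */
	private $count = 0;

	public function __construct( $max = -1 ) {
		$this->max = $max;
	}

	/** Start a new parse; the diff does the equivalent in onParserClearState(). */
	public function reset() {
		$this->count = 0;
	}

	/** True once the counter has reached the configured maximum. */
	public function exceeded() {
		return $this->max !== -1 && $this->count >= $this->max;
	}

	/** Counts the call and returns true if allowed, otherwise returns false. */
	public function tryCall() {
		if ( $this->exceeded() ) {
			return false;
		}
		$this->count++;
		return true;
	}

	/** Report string in the style of onParserLimitReport(): "count/max" or "count". */
	public function report() {
		return $this->max !== -1 ? "{$this->count}/{$this->max}" : "{$this->count}";
	}
}
```

With a limit of 2, the first two tryCall() calls succeed and the third is refused. With the default of -1, every call is counted and allowed.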

Follow-up revisions

Revision | Commit summary | Author | Date
r102272 | FU r102136: Use PLURAL, add message documentation, run number through formatN... | raymond | 08:31, 7 November 2011

Comments

#Comment by Raymond (talk | contribs)   07:51, 6 November 2011

Please add message documentation for the newly added messages. Thanks. What is $1?

#Comment by Danwe (talk | contribs)   16:26, 6 November 2011

Sorry, but the description of that message documentation is rather vague. Where exactly should this documentation be placed? I don't think it belongs in the i18n.php file, since I have never noticed anything like this in any other extension. I have just logged on to translatewiki.

In 'regexfun-limit-exceed', $1 is the limit, defined in LocalSettings.php, on the total number of Regex Fun function calls (per parser process) that deal with regular expressions.
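
For reference, a sketch of how the limit mentioned here might be set in LocalSettings.php. The include path is an assumption based on the modified paths listed for this revision, and the values are arbitrary examples. The overrides go after the inclusion because RegexFun.php loads RegexFun_Settings.php, which assigns the defaults unconditionally.

```php
// LocalSettings.php (sketch; include path and values are assumptions)
require_once "$IP/extensions/RegexFun/RegexFun.php";

// Overrides must come AFTER the inclusion, otherwise the defaults
// assigned in RegexFun_Settings.php would overwrite them.

// Disable '#regexall' within the wiki:
$egRegexFunDisabledFunctions = array( 'regexall' );

// Allow at most 100 counted regex calls per parser process (-1 = no limit):
$egRegexFunMaxRegexPerParse = 100;
```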

#Comment by Raymond (talk | contribs)   08:35, 7 November 2011

Thanks for the explanation. I have added it in the 'qqq' section, which serves as a pseudo-language, in r102272. Furthermore, I added the PLURAL magic word for proper l10n. It matters for neither English nor German, but other languages have very complex plural rules.
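
As a hypothetical illustration of what message documentation and PLURAL usage can look like in an i18n file of this style (the actual wording committed in r102272 is not shown on this page):

```php
// Hypothetical illustration only; not the text committed in r102272.

// PLURAL picks the word form that matches the number passed as $1.
$messages['en']['regexfun-limit-exceed'] =
	'Maximum of $1 "Regex Fun" regular expression {{PLURAL:$1|handling|handlings}} reached.';

// 'qqq' is the pseudo-language code that holds documentation for translators.
$messages['qqq']['regexfun-limit-exceed'] =
	'$1 is the configured limit of "Regex Fun" function calls per parser process.';
```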
