r105134 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r105134
Date:23:15, 4 December 2011
Author:danwe
Status:deferred
Tags:
Comment:
Tag for 'Regex Fun' 1.0.2
Modified paths:
  • /tags/extensions/RegexFun/REL_1_0_2 (added) (history)

Diff

Index: tags/extensions/RegexFun/REL_1_0_2/RELEASE-NOTES
@@ -0,0 +1,52 @@
 2+ Changelog:
 3+ ==========
 4+
 5+ * December 5, 2011 -- Version 1.0.2
 6+ - The limit is no longer reached prematurely when the 'e' flag is used extensively with many backrefs in the replacement.
 7+ - It's possible to use the 'Regex Fun' regex system with advanced flags within other extensions.
 8+ - Performance increased for executing huge numbers of the same regex on different strings.
 9+ - Internal representative functions for parser functions now have a 'pf_' prefix.
 10+
 11+ * November 6, 2011 -- Version 1.0.1
 12+ - Bug in '#regex_var' fixed: the default value is now returned if '#regex' failed or
 13+ was not called before.
 14+ - '#regexall' last parameter, 'length', can be empty '' which is the specified default now. It
 15+ simply means there is no limit and all items should be returned ('-1' has another meaning).
 16+ - Introduces two global configuration variables:
 17+ + '$egRegexFunDisabledFunctions' - to disable certain functions within the wiki.
 18+ + '$egRegexFunMaxRegexPerParse' - limit for number of function calls per parser process.
 19+ - Some minor cleanup done.
 20+
 21+ * November 4, 2011 -- Version 1.0 (initial public release).
 22+ Introduces the following parser functions defined within 'ExtRegexFun' class:
 23+ - #regex
 24+ - #regexall
 25+ - #regex_var
 26+ - #regexquote
 27+
 28+ Main features:
 29+ - Searching within strings, using regular expression.
 30+ - Replacing within strings, using regular expression.
 31+ - Allows safe use of user input within expressions by running the '#regexquote' parser function
 32+ over it. An important function other regex extensions still lack.
 33+ - Allows retrieving the last '#regex' sub-expression matches via '#regex_var', including
 34+ several at once in a format string, e.g. "$0 has $2, $1 and $3".
 35+ - An invalid regex results in an inline error message instead of a PHP notice, as some other
 36+ regex extensions produce.
 37+ - Efficient regex validation allowing all kinds of delimiters and flags but filtering 'e'
 38+ flag for security reasons in any case...
 39+ - ... therefore, original 'e' flag instead has another but very similar meaning adjusted for
 40+ a MediaWiki context. Instead of executing php code within the replacement string, the 'e'
 41+ flag now causes the replacement string to be parsed after references ('$1', '\1') are
 42+ replaced. This allows stuff like "{{((}}Template{{!}}$1{{))}}" within the replacement.
 43+
 44+ Changes since earlier versions (trunk and earlier, non-public):
 45+ - '#regexsearch' parser function removed. Instead there is a special flag 'r' now which leads
 46+ to the same result if #regex and replacement is being used: '' as output if nothing replaced.
 47+ - '#regexascii' parser function removed. Instead '#regexquote' will ascii-quote MW special
 48+ characters ';' and '#' if they are first character in the string.
 49+ - '#regexquote' delimiter set to '/' by default.
 50+ - '#regex' no longer returns its value as parsed wikitext (option 'noparse' => false) instead
 51+ the 'e' flag can be used (although not exactly the same).
 52+ - contributed under ISC License, maintained in wikimedia.org svn.
 53+
\ No newline at end of file
Property changes on: tags/extensions/RegexFun/REL_1_0_2/RELEASE-NOTES
___________________________________________________________________
Added: svn:eol-style
154 + native
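The release notes above describe '#regexquote' as the way to use user input safely inside an expression, including hex-escaping a leading ';' or '#'. A minimal PHP sketch of that idea follows. The helper name `quoteForRegex` is hypothetical and only for illustration; it relies on nothing beyond PHP's built-in `preg_quote`.

```php
<?php
// Hypothetical helper for illustration; not the extension's own code.
function quoteForRegex( $userInput, $delimiter = '/' ) {
	// Escape regex metacharacters and the chosen delimiter.
	$quoted = preg_quote( $userInput, $delimiter );

	// A leading '#' or ';' has special meaning at the start of wikitext,
	// so replace it with an equivalent hex escape such as '\x3b'.
	$first = substr( $quoted, 0, 1 );
	if ( $first === '#' || $first === ';' ) {
		$quoted = '\\x' . dechex( ord( $first ) ) . substr( $quoted, 1 );
	}
	return $quoted;
}

// The quoted input matches itself literally, dot and slash included.
$needle = quoteForRegex( 'a.b/c' );
$matched = preg_match( "/^{$needle}$/", 'a.b/c' );
```

On PHP 7.3 and later, `preg_quote` already escapes '#' itself, so in practice the extra branch should only trigger for ';' there.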
Index: tags/extensions/RegexFun/REL_1_0_2/RegexFun.i18n.magic.php
@@ -0,0 +1,21 @@
 2+<?php
 3+#coding: utf-8
 4+
 5+/**
 6+ * Internationalization file for magic words of the 'Regex Fun' extension.
 7+ *
 8+ * @since 1.0
 9+ *
 10+ * @file RegexFun.i18n.magic.php
 11+ * @ingroup RegexFun
 12+ * @author Daniel Werner < danweetz@web.de >
 13+ */
 14+
 15+$magicWords = array();
 16+
 17+$magicWords['en'] = array(
 18+ 'regex' => array( 0, 'regex' ),
 19+ 'regex_var' => array( 0, 'regex_var' ),
 20+ 'regexall' => array( 0, 'regexall' ),
 21+ 'regexquote' => array( 0, 'regexquote' ),
 22+);
\ No newline at end of file
Property changes on: tags/extensions/RegexFun/REL_1_0_2/RegexFun.i18n.magic.php
___________________________________________________________________
Added: svn:eol-style
123 + native
Index: tags/extensions/RegexFun/REL_1_0_2/RegexFun.i18n.php
@@ -0,0 +1,122 @@
 2+<?php
 3+
 4+/**
 5+ * Internationalisation file of the 'Regex Fun' extension
 6+ *
 7+ * @since 1.0
 8+ *
 9+ * @file RegexFun.i18n.php
 10+ * @ingroup RegexFun
 11+ * @author Daniel Werner < danweetz@web.de >
 12+ */
 13+
 14+$messages = array();
 15+
 16+/** English
 17+ * @author Daniel Werner
 18+ */
 19+$messages['en'] = array(
 20+ 'regexfun-desc' => 'Adds parser functions allowing the use of regular expressions within wiki pages',
 21+ 'regexfun-invalid' => 'The regular expression "$1" is invalid.',
 22+ 'regexfun-limit-exceed' => 'Maximum of {{PLURAL:$1|$1 "Regex Fun" regular expression handling|$1 "Regex Fun" regular expression handlings}} reached.',
 23+);
 24+
 25+/** Message documentation (Message documentation)
 26+ * @author Daniel Werner
 27+ */
 28+$messages['qqq'] = array(
 29+ 'regexfun-limit-exceed' => '$1 is the limit, defined in LocalSettings.php, of total Regex Fun function calls (per parser process) dealing with regular expressions.',
 30+);
 31+
 32+/** German (Deutsch)
 33+ * @author Daniel Werner
 34+ * @author Kghbln
 35+ */
 36+$messages['de'] = array(
 37+ 'regexfun-desc' => 'Ergänzt Parserfunktionen, die die Verwendung regulärer Ausdrücke auf Wiki-Seiten ermöglichen',
 38+ 'regexfun-invalid' => '„$1“ ist kein gültiger regulärer Ausdruck.',
 39+ 'regexfun-limit-exceed' => 'Die maximale Anzahl von $1, durch „Regex Fun“ behandelten, regulären Ausdrücken ist erreicht.',
 40+);
 41+
 42+/** French (Français)
 43+ * @author Gomoko
 44+ */
 45+$messages['fr'] = array(
 46+ 'regexfun-desc' => "Ajoute les fonctions d'analyse permettant l'utilisation d'expressions régulières dans les pages du wiki",
 47+ 'regexfun-invalid' => 'L\'expression régulière "$1" n\'est pas valide.',
 48+ 'regexfun-limit-exceed' => 'Le nombre maximal de $1 expressions régulières gérées "Regex Fun" a été atteint.',
 49+);
 50+
 51+/** Galician (Galego)
 52+ * @author Toliño
 53+ */
 54+$messages['gl'] = array(
 55+ 'regexfun-desc' => 'Engade funcións analíticas que permiten o uso de expresións regulares nas páxinas wiki',
 56+ 'regexfun-invalid' => 'A expresión regular "$1" non é válida.',
 57+ 'regexfun-limit-exceed' => 'Atinxiuse o número máximo {{PLURAL:$1|dunha manipulación de expresión regular "Regex Fun"|de $1 manipulacións de expresións regulares "Regex Fun"}}.',
 58+);
 59+
 60+/** Upper Sorbian (Hornjoserbsce)
 61+ * @author Michawiki
 62+ */
 63+$messages['hsb'] = array(
 64+ 'regexfun-desc' => 'Přidawa parserowe funkcije, kotrež wužiwanje regularnych wurazow na wikistronach dowoleja',
 65+ 'regexfun-invalid' => 'Regularny wuraz "$1" je njepłaćiwy.',
 66+ 'regexfun-limit-exceed' => 'Maksimalna licba {{PLURAL:$1|$1 přez "Regex Fun" wobdźěłaneho regularneho wuraza|$1 přez "Regex Fun" wobdźěłaneju regularneju wurazow|$1 přez "Regex Fun" wobdźěłanych regularnych wurazow|$1 přez "Regex Fun" wobdźěłanych regularnych wurazow}} je docpěta.',
 67+);
 68+
 69+/** Interlingua (Interlingua)
 70+ * @author McDutchie
 71+ */
 72+$messages['ia'] = array(
 73+ 'regexfun-desc' => 'Adder functiones al analysator syntactic que permitte le uso de expressiones regular intra paginas wiki',
 74+ 'regexfun-invalid' => 'Le expression regular "$1" es invalide.',
 75+ 'regexfun-limit-exceed' => 'Le maximo de $1 processamentos de expression regular "Regex Fun" ha essite attingite.',
 76+);
 77+
 78+/** Japanese (日本語)
 79+ * @author Fryed-peach
 80+ */
 81+$messages['ja'] = array(
 82+ 'regexfun-desc' => 'ウィキページ内で正規表現の使用を可能にするパーサー関数を追加する',
 83+ 'regexfun-invalid' => '正規表現「$1」は不正です。',
 84+ 'regexfun-limit-exceed' => 'Regex Fun の正規表現処理最大数 {{PLURAL:$1|$1}} に達しました。',
 85+);
 86+
 87+/** Macedonian (Македонски)
 88+ * @author Bjankuloski06
 89+ */
 90+$messages['mk'] = array(
 91+ 'regexfun-desc' => 'Додава парсерски функции што овозможуваат употреба на регуларни изрази во вики-страници',
 92+ 'regexfun-invalid' => 'Регуларниот израз „$1“ е неважечки.',
 93+ 'regexfun-limit-exceed' => 'Достигнат е максимумот од $1 регуларни изрази сработени со „Regex Fun“.',
 94+);
 95+
 96+/** Malay (Bahasa Melayu)
 97+ * @author Anakmalaysia
 98+ */
 99+$messages['ms'] = array(
 100+ 'regexfun-desc' => 'Menambahkan fungsi-fungsi penghurai yang membolehkan penggunaan ungkapan nalar dalam laman wiki',
 101+ 'regexfun-invalid' => 'Ungkapan nalar "$1" tidak sah.',
 102+ 'regexfun-limit-exceed' => 'Had maksimum $1 kendalian ungkapan nalar "Regex Fun" tercapai.',
 103+);
 104+
 105+/** Dutch (Nederlands)
 106+ * @author Siebrand
 107+ * @author Tjcool007
 108+ */
 109+$messages['nl'] = array(
 110+ 'regexfun-desc' => "Voegt parserfuncties toe die mogelijk maken om reguliere expressies te gebruiken in wikipagina's",
 111+ 'regexfun-invalid' => 'De reguliere expressie "$1" is ongeldig.',
 112+ 'regexfun-limit-exceed' => 'Het maximale aantal af te handelen reguliere expressies is bereikt ($1).',
 113+);
 114+
 115+ /** Norwegian (bokmål) (Norsk (bokmål))
 116+ * @author Event
 117+ */
 118+$messages['no'] = array(
 119+ 'regexfun-desc' => 'Legg til parserfunksjoner som tillater bruk av regulæruttrykk på wikisider',
 120+ 'regexfun-invalid' => 'Regulæruttrykket "$1" er ugyldig.',
 121+ 'regexfun-limit-exceed' => 'Det maksimalt antallet på {{PLURAL:$1|$1 "Regex Fun"-regulæruttrykk|$1 "Regex Fun"-regulæruttrykk}} er nådd.',
 122+);
 123+
Property changes on: tags/extensions/RegexFun/REL_1_0_2/RegexFun.i18n.php
___________________________________________________________________
Added: svn:eol-style
1124 + native
Index: tags/extensions/RegexFun/REL_1_0_2/COPYING
@@ -0,0 +1,13 @@
 2+Copyright (c) 2010 - 2011 by Daniel Werner < danweetz@web.de >
 3+
 4+Permission to use, copy, modify, and/or distribute this software for any
 5+purpose with or without fee is hereby granted, provided that the above
 6+copyright notice and this permission notice appear in all copies.
 7+
 8+THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 9+WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 10+MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 11+ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 12+WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 13+ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 14+OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
\ No newline at end of file
Property changes on: tags/extensions/RegexFun/REL_1_0_2/COPYING
___________________________________________________________________
Added: svn:eol-style
115 + native
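The RegexFun.php diff that follows validates user-supplied patterns by stripping the special 'e' and 'r' flags and then test-running the cleaned pattern. The 'r' flag makes a replacement return '' when nothing matched. A simplified, standalone sketch of that approach is shown here. The function names (`splitSpecialFlags`, `isUsableRegex`, `replaceOrEmpty`) are hypothetical, and the `@` operator stands in for MediaWiki's `wfSuppressWarnings()` so the sketch runs outside MediaWiki.

```php
<?php
// Hypothetical names; a simplified sketch, not the extension's implementation.

/**
 * Splits "/body/flags" at the last delimiter and removes the given special
 * flags from the flags part. Returns array( cleanedPattern, foundFlags ).
 */
function splitSpecialFlags( $pattern, array $special = array( 'e', 'r' ) ) {
	if ( strlen( $pattern ) < 2 ) {
		return array( $pattern, array() ); // too short to be a regex
	}
	$delimiter = substr( $pattern, 0, 1 );
	$lastPos = strrpos( $pattern, $delimiter );
	if ( $lastPos === 0 ) {
		return array( $pattern, array() ); // no closing delimiter
	}
	$body = substr( $pattern, 0, $lastPos + 1 ); // delimiter to delimiter
	$flags = (string)substr( $pattern, $lastPos + 1 );
	$found = array();
	foreach ( $special as $flag ) {
		if ( strpos( $flags, $flag ) !== false ) {
			$found[$flag] = true;
			$flags = str_replace( $flag, '', $flags );
		}
	}
	return array( $body . $flags, $found );
}

function isUsableRegex( $pattern ) {
	// preg_match() returns false (and raises a warning) for a broken pattern.
	return @preg_match( $pattern, ' ' ) !== false;
}

function replaceOrEmpty( $pattern, $replacement, $subject ) {
	// 'r' flag behaviour: output only if at least one replacement happened.
	$count = 0;
	$out = preg_replace( $pattern, $replacement, $subject, -1, $count );
	return $count > 0 ? $out : '';
}

list( $clean, $found ) = splitSpecialFlags( '/(\d+)/ier' );
```

Stripping 'e' before the pattern ever reaches a `preg_*` function is the important part: the sketch never hands a user-controlled 'e' modifier to PHP.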
Index: tags/extensions/RegexFun/REL_1_0_2/RegexFun.php
@@ -0,0 +1,706 @@
 2+<?php
 3+
 4+/**
 5+ * 'Regex Fun' is a MediaWiki extension which adds parser functions for performing regular
 6+ * expression searches and replacements.
 7+ *
 8+ * Documentation: http://www.mediawiki.org/wiki/Extension:Regex_Fun
 9+ * Support: http://www.mediawiki.org/wiki/Extension_talk:Regex_Fun
 10+ * Source code: http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/RegexFun
 11+ *
 12+ * @version: 1.0.2
 13+ * @license: ISC license
 14+ * @author: Daniel Werner < danweetz@web.de >
 15+ *
 16+ * @file RegexFun.php
 17+ * @ingroup RegexFun
 18+ */
 19+
 20+if ( ! defined( 'MEDIAWIKI' ) ) { die( ); }
 21+
 22+
 23+/**** extension info ****/
 24+
 25+$wgExtensionCredits['parserhook'][] = array(
 26+ 'path' => __FILE__,
 27+ 'name' => 'Regex Fun',
 28+ 'descriptionmsg' => 'regexfun-desc',
 29+ 'version' => ExtRegexFun::VERSION,
 30+ 'author' => '[http://www.mediawiki.org/wiki/User:Danwe Daniel Werner]',
 31+ 'url' => 'http://www.mediawiki.org/wiki/Extension:Regex_Fun',
 32+);
 33+
 34+// language files:
 35+$wgExtensionMessagesFiles['RegexFun' ] = ExtRegexFun::getDir() . '/RegexFun.i18n.php';
 36+$wgExtensionMessagesFiles['RegexFunMagic'] = ExtRegexFun::getDir() . '/RegexFun.i18n.magic.php';
 37+
 38+// hooks registration:
 39+$wgHooks['ParserFirstCallInit'][] = 'ExtRegexFun::init';
 40+$wgHooks['ParserClearState' ][] = 'ExtRegexFun::onParserClearState';
 41+$wgHooks['ParserLimitReport' ][] = 'ExtRegexFun::onParserLimitReport';
 42+
 43+// Include the settings file:
 44+require_once ExtRegexFun::getDir() . '/RegexFun_Settings.php';
 45+
 46+
 47+/**
 48+ * Extension class with all the regex functions functionality
 49+ *
 50+ * @since 1.0
 51+ */
 52+class ExtRegexFun {
 53+
 54+ /**
 55+ * Version of the 'RegexFun' extension.
 56+ *
 57+ * @since 1.0
 58+ *
 59+ * @var string
 60+ */
 61+ const VERSION = '1.0.2';
 62+
 63+ /**
 64+ * Sets up parser functions
 65+ */
 66+ public static function init( Parser &$parser ) {
 67+ self::initFunction( $parser, 'regex' );
 68+ self::initFunction( $parser, 'regex_var' );
 69+ self::initFunction( $parser, 'regexquote' );
 70+ self::initFunction( $parser, 'regexall' );
 71+
 72+ return true;
 73+ }
 74+ private static function initFunction( Parser &$parser, $name, $functionCallback = null ) {
 75+ if( $functionCallback === null ) {
 76+ $functionCallback = array( __CLASS__, "pf_{$name}" );
 77+ }
 78+
 79+ global $egRegexFunDisabledFunctions;
 80+
 81+ // only register function if not disabled by configuration
 82+ if( ! in_array( $name, $egRegexFunDisabledFunctions ) ) {
 83+ $parser->setFunctionHook( $name, $functionCallback );
 84+ }
 85+ }
 86+
 87+ /**
 88+ * Returns the extension's base installation directory.
 89+ *
 90+ * @since 1.0
 91+ *
 92+ * @return string
 93+ */
 94+ public static function getDir() {
 95+ static $dir = null;
 96+
 97+ if( $dir === null ) {
 98+ $dir = dirname( __FILE__ );
 99+ }
 100+ return $dir;
 101+ }
 102+
 103+
 104+ const FLAG_NO_REPLACE_NO_OUT = 'r';
 105+ const FLAG_REPLACEMENT_PARSE = 'e'; // overwrites php 'e' flag
 106+
 107+ /**
 108+ * helper store for transmitting some values to a preg_replace_callback function
 109+ *
 110+ * @var array
 111+ */
 112+ private static $tmpRegexCB;
 113+
 114+
 115+ /**
 116+ * Checks whether the given regular expression is valid or would cause an error.
 117+ * Also alters the pattern in case it would be a security risk and communicates
 118+ * about special flags which have no or different meaning in PHP. These will be
 119+ * removed from the original regex string but put into the &$specialFlags array.
 120+ *
 121+ * @since 1.0
 122+ *
 123+ * @param &$pattern String
 124+ * @param &$specialFlags array will contain all special flags the $pattern contains
 125+ *
 126+ * @return Boolean
 127+ */
 128+ public static function validateRegex( &$pattern, &$specialFlags = array() ) {
 129+
 130+ $specialFlags = array();
 131+
 132+ if( strlen( $pattern ) < 2 ) {
 133+ return false;
 134+ }
 135+
 136+ $delimiter = substr( trim( $pattern ), 0, 1 );
 137+ $delimiterQuoted = preg_quote( $delimiter, '/' );
 138+
 139+ // two parts, split by the last delimiter
 140+ $parts = preg_split( "/{$delimiterQuoted}(?=[^{$delimiterQuoted}]*$)/", $pattern, 2 );
 141+
 142+ $mainPart = $parts[0] . $delimiter; // delimiter to delimiter without flags
 143+ $flagsPart = $parts[1];
 144+
 145+ // remove 'e' modifier from final regex since it's a huge security risk with user input!
 146+ self::regexSpecialFlagsHandler( $flagsPart, self::FLAG_REPLACEMENT_PARSE, $specialFlags );
 147+
 148+ // 'r' flag: #regex with a replacement will output '' if nothing was replaced
 149+ self::regexSpecialFlagsHandler( $flagsPart, self::FLAG_NO_REPLACE_NO_OUT, $specialFlags );
 150+
 151+ // put purified regex back together:
 152+ $newPattern = $mainPart . $flagsPart;
 153+
 154+ if( ! self::isValidRegex( $newPattern ) ) {
 155+ // no modification to $pattern done!
 156+ $specialFlags = array();
 157+ return false;
 158+ }
 159+ $pattern = $newPattern; // remember reference!
 160+ return true;
 161+ }
 162+
 163+ /**
 164+ * Returns whether the regular expression would be a valid one or not.
 165+ *
 166+ * @since 1.0
 167+ *
 168+ * @param $pattern string
 169+ *
 170+ * @return boolean
 171+ */
 172+ public static function isValidRegex( $pattern ) {
 173+ //return (bool)preg_match( '/^([\\/\\|%]).*\\1[imsSuUx]*$/', $pattern );
 174+ /*
 175+ * Testing of the pattern in a very simple way:
 176+ * This takes care of all invalid regular expression use and the ugly php notices
 177+ * which some other regex extensions for MW won't handle right.
 178+ */
 179+ wfSuppressWarnings(); // instead of using the evil @ operator!
 180+ $isValid = false !== preg_match( $pattern, ' ' ); // preg_match returns false on error
 181+ wfRestoreWarnings();
 182+
 183+ return $isValid;
 184+ }
 185+
 186+ /**
 187+ * Helper function to check a string of flags for a certain flag and set it as an array key
 188+ * in a special flags collecting array.
 189+ */
 190+ private static function regexSpecialFlagsHandler( &$modifiers, $flag, &$specialFlags ) {
 191+ $count = 0;
 192+ $modifiers = preg_replace( "/{$flag}/", '', $modifiers, -1, $count );
 193+ if( $count > 0 ) {
 194+ $specialFlags[ $flag ] = true;
 195+ return true;
 196+ }
 197+ return false;
 198+ }
 199+
 200+ private static function limitHandler( Parser &$parser ) {
 201+ // is the limit exceeded for this parser's parse() process?
 202+ if( self::limitExceeded( $parser ) ) {
 203+ return false;
 204+ }
 205+ self::increaseRegexCount( $parser );
 206+ return true;
 207+ }
 208+
 209+ /**
 210+ * Returns a valid parser function output that the given pattern is not valid for a regular
 211+ * expression. The message can be displayed in the wiki and is wrapped in an error-class span
 212+ * which can be recognized by #iferror
 213+ *
 214+ * @param $pattern String the invalid regular expression
 215+ *
 216+ * @return Array
 217+ */
 218+ protected static function msgInvalidRegex( $pattern ) {
 219+ $msg = '<span class="error">' . wfMsgForContent( 'regexfun-invalid', "<tt><nowiki>$pattern</nowiki></tt>" ). '</span>';
 220+ return array( $msg, 'noparse' => false, 'isHTML' => false ); // 'noparse' must be false so <nowiki> gets processed, 'isHTML' false for #iferror!
 221+ }
 222+
 223+ protected static function msgLimitExceeded() {
 224+ global $egRegexFunMaxRegexPerParse, $wgContLang;
 225+ $msg = '<span class="error">' . wfMsgForContent( 'regexfun-limit-exceed', $wgContLang->formatNum( $egRegexFunMaxRegexPerParse ) ) . '</span>';
 226+ return array( $msg, 'noparse' => true, 'isHTML' => false ); // 'isHTML' must be false for #iferror!
 227+ }
 228+
 229+ /**
 230+ * Helper function. Validates regex and takes care of security risks in pattern which is why
 231+ * the pattern is taken by reference!
 232+ */
 233+ protected static function validateRegexCall( Parser &$parser, $subject, &$pattern, &$specialFlags, $resetLastRegex = false ) {
 234+ if( $resetLastRegex ) {
 235+ //reset last matches for the case anything goes wrong
 236+ self::setLastMatches( $parser , null );
 237+ }
 238+ if( ! self::validateRegex( $pattern, $specialFlags ) ) {
 239+ return false;
 240+ }
 241+ if( $resetLastRegex ) {
 242+ // store infos for this regex for '#regex_var'
 243+ self::initLastRegex( $parser, $pattern, $subject );
 244+ }
 245+ return true;
 246+ }
 247+
 248+ /**
 249+ * Performs a regular expression search or replacement
 250+ *
 251+ * @param $parser Parser instance of running Parser
 252+ * @param $subject String input string to evaluate
 253+ * @param $pattern String regular expression pattern - must use /, | or % delimiter
 254+ * @param $replacement String regular expression replacement
 255+ *
 256+ * @return String Result of replacing pattern with replacement in string, or matching text if replacement was omitted
 257+ */
 258+ public static function pf_regex( Parser &$parser, $subject = '', $pattern = '', $replacement = null, $limit = -1 ) {
 259+ // check whether limit exceeded:
 260+ if( self::limitExceeded( $parser ) ) {
 261+ return self::msgLimitExceeded();
 262+ }
 263+ self::increaseRegexCount( $parser );
 264+
 265+ if( $replacement === null ) {
 266+ // search mode:
 267+
 268+ // validate, initialise and check for wrong input:
 269+ $continue = self::validateRegexCall( $parser, $subject, $pattern, $specialFlags, true );
 270+ if( ! $continue ) {
 271+ return self::msgInvalidRegex( $pattern );
 272+ }
 273+
 274+ $lastMatches = self::getLastMatches( $parser );
 275+ $output = ( preg_match( $pattern, $subject, $lastMatches ) ? $lastMatches[0] : '' );
 276+ self::setLastMatches( $parser, $lastMatches );
 277+ }
 278+ else {
 279+ // replace mode:
 280+ $limit = (int)$limit;
 281+
 282+ // set last matches to 'false' and get them on demand instead since preg_replace won't communicate them
 283+ self::setLastMatches( $parser, false );
 284+
 285+ // do the regex plus all handling of special flags and validation
 286+ $output = self::doPregReplace( $pattern, $replacement, $subject, $limit, $parser );
 287+
 288+ if( $output === false ) {
 289+ // invalid regex, don't store any info for '#regex_var'
 290+ self::setLastMatches( $parser , null );
 291+ return self::msgInvalidRegex( $pattern );
 292+ }
 293+
 294+ // set these infos only if valid, pattern still contains special flags though
 295+ self::setLastPattern( $parser, $pattern );
 296+ self::setLastSubject( $parser, $subject );
 297+ }
 298+
 299+ return $output;
 300+ }
 301+
 302+ /**
 303+ * 'preg_replace'-like function but can handle special modifiers 'e' and 'r'.
 304+ *
 305+ * @param string &$pattern
 306+ * @param string $replacement
 307+ * @param string $subject
 308+ * @param int $limit
 309+ * @param Parser &$parser if 'e' flag should be allowed, a parser object for parsing is required.
 310+ * @param array $allowedSpecialFlags all special flags that should be handled, by default 'e' and 'r'.
 311+ */
 312+ public static function doPregReplace(
 313+ $pattern, // passed by value here, not by reference!
 314+ $replacement,
 315+ $subject,
 316+ $limit = -1,
 317+ &$parser = null,
 318+ array $allowedSpecialFlags = array(
 319+ self::FLAG_REPLACEMENT_PARSE,
 320+ self::FLAG_NO_REPLACE_NO_OUT,
 321+ )
 322+ ) {
 323+ static $lastPattern = null;
 324+ static $activePattern = null;
 325+ static $specialFlags = null;
 326+
 327+ /*
 328+ * cache validated pattern and use it as long as nothing has changed, this makes things
 329+ * faster in case we do a lot of stuff with the same regex.
 330+ */
 331+ if( $lastPattern === null || $lastPattern !== $pattern ) {
 332+ // remember pattern without validation
 333+ $lastPattern = $pattern;
 334+
 335+ // if allowed special flags change, we have to validate again^^
 336+ $lastFlags = implode( ',', $allowedSpecialFlags );
 337+
 338+ // validate regex and get special flags 'e' and 'r' if given:
 339+ if( ! self::validateRegex( $pattern, $specialFlags ) ) {
 340+ // invalid regex!
 341+ $lastPattern = null;
 342+ return false;
 343+ }
 344+ // set validated pattern as active one
 345+ $activePattern = $pattern;
 346+
 347+ // filter unwanted special flags:
 348+ $allowedSpecialFlags = array_flip( $allowedSpecialFlags );
 349+ $specialFlags = array_intersect_key( $specialFlags, $allowedSpecialFlags );
 350+ }
 351+ else {
 352+ // set last validated pattern without flags 'e' and 'r'
 353+ $pattern = $activePattern;
 354+ }
 355+
 356+
 357+ // FLAG 'e' (parse replace after match) handling:
 358+ if( ! empty( $specialFlags[ self::FLAG_REPLACEMENT_PARSE ] ) ) {
 359+
 360+ // 'e' requires a Parser for parsing!
 361+ if( ! ( $parser instanceof Parser ) ) {
 362+ // no valid Parser object, without, we can't parse anything!
 363+ throw new MWException( "Regex Fun 'e' flag discovered but no Parser object given!" );
 364+ }
 365+
 366+ // if 'e' flag is set, each replacement has to be parsed after matches are inserted but before replacing!
 367+ self::$tmpRegexCB = array(
 368+ 'replacement' => $replacement,
 369+ 'parser' => &$parser,
 370+ 'internal' => isset( $parser->mExtRegexFun['lastMatches'] ) && $parser->mExtRegexFun['lastMatches'] === false
 371+ );
 372+
 373+ $output = preg_replace_callback( $pattern, array( __CLASS__, 'doPregReplace_eFlag_callback' ), $subject, $limit, $count );
 374+ }
 375+ else {
 376+ // no 'e' flag, we can perform the standard function
 377+ $output = preg_replace( $pattern, $replacement, $subject, $limit, $count );
 378+ }
 379+
 380+
 381+ // FLAG 'r' (no replacement - no output) handling:
 382+ if( ! empty( $specialFlags[ self::FLAG_NO_REPLACE_NO_OUT ] ) ) {
 383+ /*
 384+ * only output replacement result if there actually was a match and therewith a replacement happened
 385+ * (otherwise the input string would be returned)
 386+ */
 387+ if( $count < 1 ) {
 388+ return '';
 389+ }
 390+ }
 391+
 392+ return $output;
 393+ }
 394+
 395+ private static function doPregReplace_eFlag_callback( $matches ) {
 396+
 397+ /** Don't cache this since it could contain dynamic content like #var which should be parsed */
 398+
 399+ $replace = self::$tmpRegexCB['replacement'];
 400+ $parser = self::$tmpRegexCB['parser'];
 401+ $internal = self::$tmpRegexCB['internal']; // whether doPregReplace() is called as part of a parser function
 402+
 403+ /*
 404+ * only do this if set to false before, internally, so we won't destroy things if
 405+ * doPregReplace() was called from outside 'Regex Fun'
 406+ */
 407+ if( $internal ) {
 408+ // last matches in #regex replace mode were set to false before, set them now:
 409+ self::setLastMatches( $parser, $matches );
 410+ }
 411+ // replace backrefs with their actual values:
 412+ $replace = self::regexVarReplace( $replace, $matches );
 413+
 414+ // parse the replacement after matches are inserted
 415+ // use a new frame, no need for SFH_OBJECT_ARGS style parser functions
 416+ $frame = $parser->getPreprocessor()->newCustomFrame( $parser );
 417+ $replace = $parser->preprocessToDom( $replace );
 418+ $replace = trim( $frame->expand( $replace ) );
 419+
 420+ return $replace;
 421+ }
 422+
 423+ /**
 424+ * Performs regular expression searches and returns ALL matches separated
 425+ *
 426+ * @param $parser Parser instance of running Parser
 427+ * @param $subject String input string to evaluate
 428+ * @param $pattern String regular expression pattern - must use /, | or % as delimiter
 429+ * @param $separator String to separate all the matches
 430+ * @param $offset Integer first match to print out. Negative values possible: -1 means last match.
 431+ * @param $length Integer maximum matches for print out
 432+ *
 433+ * @return String result of all matching text parts separated by a string
 434+ */
 435+ public static function pf_regexall( &$parser , $subject = '' , $pattern = '' , $separator = ', ' , $offset = 0 , $length = '' ) {
 436+ // check whether limit exceeded:
 437+ if( self::limitExceeded( $parser ) ) {
 438+ return self::msgLimitExceeded();
 439+ }
 440+ self::increaseRegexCount( $parser );
 441+
 442+ // validate and check for wrong input:
 443+ $continue = self::validateRegexCall( $parser, $subject, $pattern, $specialFlags, false );
 444+ if( ! $continue ) {
 445+ return self::msgInvalidRegex( $pattern );
 446+ }
 447+
 448+ // adjust default values:
 449+ $offset = (int)$offset;
 450+
 451+ if( trim( $length ) === '' ) {
 452+ $length = null;
 453+ } else {
 454+ $length = (int)$length;
 455+ }
 456+
 457+ if( preg_match_all( $pattern, $subject, $matches, PREG_SET_ORDER ) ) {
 458+
 459+ $matches = array_slice( $matches, $offset, $length );
 460+ $output = ''; //$end = ($end or ($end >= count($matches)) ? $end : count($matches) );
 461+
 462+ for( $count = 0; $count < count( $matches ); $count++ ) {
 463+ if( $count > 0 ) {
 464+ $output .= $separator;
 465+ }
 466+ $output .= trim( $matches[ $count ][0] );
 467+ }
 468+ return $output;
 469+ }
 470+ return '';
 471+ }
 472+
 473+ /**
 474+ * Returns a value from the last performed regex match
 475+ *
 476+ * @param $parser Parser instance of running Parser
 477+ * @param $index Integer|String index of the last match which should be returned, or a string containing $n as indexes to be replaced
 478+ * @param $defaultVal String default value which will be returned when the result with the given index doesn't exist or is an empty string
 479+ */
 480+ public static function pf_regex_var( &$parser, $index = 0, $defaultVal = '' ) {
 481+ // get matches from last #regex
 482+ $lastMatches = self::getLastMatches( $parser );
 483+
 484+ if( $lastMatches === null ) { // last regex was invalid or none executed yet
 485+ return $defaultVal;
 486+ }
 487+
 488+ // if requested index is numerical:
 489+ if (preg_match( '/^\d+$/', $index ) ) {
 490+ // if requested index is in matches and isn't '':
 491+ if( array_key_exists( $index, $lastMatches ) && $lastMatches[$index] !== '' )
 492+ return $lastMatches[ $index ];
 493+ else {
 494+ // no match! Return just the default value:
 495+ return $defaultVal;
 496+ }
 497+ } else {
 498+ // complex string is given, something like "$1, $2 and $3":
 499+
 500+ // limit check, only in complex mode:
 501+ if( self::limitExceeded( $parser ) ) {
 502+ return self::msgLimitExceeded();
 503+ }
 504+ self::increaseRegexCount( $parser );
 505+
 506+ // do the actual transformation:
 507+ return self::regexVarReplace( $index, $lastMatches );
 508+ }
 509+ }
 510+
 511+ /**
 512+ * Replaces all backref variables within a replacement string with the backrefs actual
 513+ * values just like preg_replace would do it.
 514+ */
 515+ private static function regexVarReplace( $replacement, $matches ) {
 516+ /*
 517+ * replace all back-references with their number increased by 1!
 518+ * this way we can also handle $0 in the right way!
 519+ */
 520+ $replacement = preg_replace_callback(
 521+ '%(?<!\\\)(?:\$(?:(\d+)|\{(\d+)\})|\\\(\d+))%',
 522+ array( __CLASS__, 'regexVarReplace_increaseBackrefs_callback' ),
 523+ $replacement
 524+ );
 525+ /*
 526+ * build a helper regex matching all the last matches to use preg_replace
 527+ * which will handle all the replace-escaping handling correct
 528+ */
 529+ $regEx = '';
 530+ foreach( $matches as $match ) {
 531+ $regEx .= '(' . preg_quote( $match, '/' ) . ')';
 532+ }
 533+ $regEx = "/^{$regEx}$/";
 534+
 535+ return preg_replace( $regEx, $replacement, implode( '', $matches ) );
 536+ }
 537+
 538+ /**
 539+ * only used by 'preg_replace_callback' in 'regexVarReplace'
 540+ */
 541+ private static function regexVarReplace_increaseBackrefs_callback( $matches ) {
 542+ // find index:
 543+ $index = false;
 544+ $full = $matches[0];
 545+ for( $i = 1; $index === false || $index === '' ; $i++ ) {
 546+ // $index can be false (shouldn't happen), '' or any number (including 0 !)
 547+ $index = @$matches[ $i ];
 548+ }
 549+ return preg_replace( '%\d+%', (int)$index + 1, $full );
 550+ }
 551+
 552+ /**
 553+ * takes $str and puts a backslash in front of each character that is part of the regular expression syntax
 554+ *
 555+ * @param $parser Parser instance of running Parser
 556+ * @param $str String input string to change
 557+ * @param $delimiter String delimiter which also will be escaped within $str (default is set to '/')
 558+ *
 559+ * @return String Returns the quoted string
 560+ */
 561+ public static function pf_regexquote( &$parser, $str = null, $delimiter = '/' ) {
 562+ if( $str === null ) {
 563+ return '';
 564+ }
 565+ if( $delimiter === '' ) {
 566+ $delimiter = null;
 567+ }
 568+ // do this first! otherwise leading '\' from '\x..' would be doubled!
 569+ $str = preg_quote( $str, $delimiter );
 570+
 571+ /*
 572+ * take care of characters that will mess things up if returned as first ones in a string
 573+ * because they have some special meaning in mediawiki
 574+ */
 575+ $firstChar = substr( $str, 0, 1 );
 576+ switch( $firstChar ) {
 577+ // '*' and ':' are taken care of by preg_quote already
 578+ case '#':
 579+ case ';':
 580+ // e.g. first char as '\x23' in case of '#'
 581+ $str = '\\x' . dechex( ord( $firstChar ) ) . substr( $str, 1 );
 582+ break;
 583+ }
 584+ return $str;
 585+ }
 586+
 587+ public static function onParserLimitReport( $parser, &$report ) {
 588+ global $egRegexFunMaxRegexPerParse;
 589+ $count = self::getLimitCount( $parser );
 590+
 591+ $report .= 'ExtRegexFun count: ';
 592+
 593+ if( $egRegexFunMaxRegexPerParse !== -1 ) {
 594+ $report .= "{$count}/{$egRegexFunMaxRegexPerParse}\n";
 595+ }
 596+ else {
 597+ $report .= "{$count}\n";
 598+ }
 599+ return true;
 600+ }
 601+
 602+
 603+ /***********************************
 604+ **** HELPER - For store of ****
 605+ **** regex stuff within Parser ****
 606+ ***********************************
 607+ ****
 608+ **
 609+ * Adding the info to each Parser object makes it invulnerable to new Parser objects being created
 610+ * and destroyed throughout the main parsing process. Only the parser that 'ParserClearState' is called
 611+ * on will lose its data, since its parsing process has been declared finished and the data won't be
 612+ * needed anymore.
 613+ **
 614+ ***/
 615+
 616+ protected static function initLastRegex( Parser &$parser, $pattern, $subject ) {
 617+ self::setLastMatches( $parser, array() );
 618+ self::setLastPattern( $parser, $pattern );
 619+ self::setLastSubject( $parser, $subject );
 620+ }
 621+
 622+ public static function onParserClearState( &$parser ) {
 623+ //cleanup to avoid conflicts with job queue or Special:Import
 624+ $parser->mExtRegexFun = array();
 625+ self::setLastMatches( $parser, null );
 626+ self::setLastPattern( $parser, '' );
 627+ self::setLastSubject( $parser, '' );
 628+ $parser->mExtRegexFun['counter'] = 0;
 629+
 630+ return true;
 631+ }
 632+
 633+ /**
 634+ * Returns whether the maximum number of regular expression executions has been reached or
 635+ * exceeded for the given parser object's current Parser::parse() process.
 636+ *
 637+ * @return boolean
 638+ */
 639+ public static function limitExceeded( Parser &$parser ) {
 640+ global $egRegexFunMaxRegexPerParse;
 641+ return (
 642+ $egRegexFunMaxRegexPerParse !== -1
 643+ && $parser->mExtRegexFun['counter'] >= $egRegexFunMaxRegexPerParse
 644+ );
 645+ }
 646+
 647+ public static function getLimitCount( Parser &$parser ) {
 648+ if( isset( $parser->mExtRegexFun['counter'] ) ) {
 649+ return $parser->mExtRegexFun['counter'];
 650+ }
 651+ return 0;
 652+ }
 653+
 654+ private static function increaseRegexCount( Parser &$parser ) {
 655+ $parser->mExtRegexFun['counter']++;
 656+ }
 657+
 658+ /**
 659+ * Returns the last regex matches done by #regex in the context of the same parser object.
 660+ *
 661+ * @param Parser $parser
 662+ * @return array|null
 663+ */
 664+ public static function getLastMatches( Parser &$parser ) {
 665+
 666+ if( isset( $parser->mExtRegexFun['lastMatches'] ) ) {
 667+
 668+ // last matches are set to false in case last regex was in replace mode! Get them on demand:
 669+ if( $parser->mExtRegexFun['lastMatches'] === false ) {
 670+ // first, validate pattern to remove special flags!
 671+ $pattern = self::getLastPattern( $parser );
 672+ self::validateRegex( $pattern );
 673+ preg_match(
 674+ $pattern,
 675+ self::getLastSubject( $parser ),
 676+ $parser->mExtRegexFun['lastMatches']
 677+ );
 678+ }
 679+ return $parser->mExtRegexFun['lastMatches'];
 680+ }
 681+ return null;
 682+ }
 683+ protected static function setLastMatches( Parser &$parser, $value ) {
 684+ $parser->mExtRegexFun['lastMatches'] = $value;
 685+ }
 686+
 687+ public static function getLastPattern( Parser &$parser ) {
 688+ if( isset( $parser->mExtRegexFun['lastPattern'] ) ) {
 689+ return $parser->mExtRegexFun['lastPattern'];
 690+ }
 691+ return '';
 692+ }
 693+ protected static function setLastPattern( Parser &$parser, $value ) {
 694+ $parser->mExtRegexFun['lastPattern'] = $value;
 695+ }
 696+
 697+ public static function getLastSubject( Parser &$parser ) {
 698+ if( isset( $parser->mExtRegexFun['lastSubject'] ) ) {
 699+ return $parser->mExtRegexFun['lastSubject'];
 700+ }
 701+ return '';
 702+ }
 703+ protected static function setLastSubject( Parser &$parser, $value ) {
 704+ $parser->mExtRegexFun['lastSubject'] = $value;
 705+ }
 706+
 707+}
\ No newline at end of file
Property changes on: tags/extensions/RegexFun/REL_1_0_2/RegexFun.php
___________________________________________________________________
Added: svn:eol-style
1708 + native
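The first-character handling in 'pf_regexquote' above can be tried outside MediaWiki with a minimal sketch like the following. The function name 'regexQuoteSketch' is hypothetical and not part of the extension; it drops the $parser argument and otherwise follows the same steps. Note that, as far as I know, PHP 7.3 and later also escape '#' in preg_quote, so on those versions the '#' branch would probably never trigger, because the quoted string then starts with a backslash.

```php
<?php
// Hypothetical standalone helper (not part of the extension): mirrors the
// steps of pf_regexquote without needing a Parser instance.
function regexQuoteSketch( $str, $delimiter = '/' ) {
	if ( $delimiter === '' ) {
		// an empty delimiter means "no delimiter to escape"
		$delimiter = null;
	}
	// quote first, so the backslash of the '\x..' prefix added below is not doubled
	$str = preg_quote( $str, $delimiter );

	// '#' and ';' have a special meaning in wikitext when they start a line,
	// so a leading one is replaced by its hexadecimal escape
	$firstChar = substr( $str, 0, 1 );
	if ( $firstChar === '#' || $firstChar === ';' ) {
		$str = '\\x' . dechex( ord( $firstChar ) ) . substr( $str, 1 );
	}
	return $str;
}

echo regexQuoteSketch( ';a.b' ), "\n"; // prints \x3ba\.b
echo regexQuoteSketch( 'a/b' ), "\n";  // prints a\/b
```

The escaped result still matches the original input when used as a pattern, since PCRE reads '\x3b' as the character ';'.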
Index: tags/extensions/RegexFun/REL_1_0_2/RegexFun_Settings.php
@@ -0,0 +1,44 @@
 2+<?php
 3+
 4+/**
 5+ * File defining the settings for the 'Regex Fun' extension.
 6+ * More info can be found at http://www.mediawiki.org/wiki/Extension:Regex_Fun#Configuration
 7+ *
 8+ * NOTICE:
 9+ * =======
 10+ * Changing one of these settings can be done by copying and placing
 11+ * it in LocalSettings.php, AFTER the inclusion of 'Regex Fun'.
 12+ *
 13+ * @file RegexFun_Settings.php
 14+ * @ingroup RegexFun
 15+ * @since 1.0.1
 16+ *
 17+ * @author Daniel Werner
 18+ */
 19+
 20+/**
 21+ * Allows defining functions which should not be available within the wiki.
 22+ *
 23+ * @example
 24+ * # disable '#regexall' and '#regex_var' functions:
 25+ * $egRegexFunDisabledFunctions = array( 'regexall', 'regex_var' );
 26+ *
 27+ * @since 1.0.1
 28+ * @var array
 29+ */
 30+$egRegexFunDisabledFunctions = array();
 31+
 32+
 33+/**
 34+ * Defines the maximum number of regular expression executions per parser process. This
 35+ * counts all regular expressions executed by this extension. The counter
 36+ * is increased by '#regex', '#regexall' and '#regex_var' if a reference
 37+ * string is given, but not if only an index is requested. '#regexquote' is not
 38+ * affected. When the limit is exceeded, an error message catchable by '#iferror'
 39+ * is output instead of the result of the function.
 40+ * The limit can be set to -1 to disable the limit (default).
 41+ *
 42+ * @since 1.0.1
 43+ * @var integer
 44+ */
 45+$egRegexFunMaxRegexPerParse = -1;
Property changes on: tags/extensions/RegexFun/REL_1_0_2/RegexFun_Settings.php
___________________________________________________________________
Added: svn:eol-style
146 + native
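As the NOTICE in RegexFun_Settings.php says, overrides belong in LocalSettings.php after the extension is included. A fragment along these lines would do that; the limit of 500 is an arbitrary value chosen for illustration, not a recommendation from the extension.

```php
# Regex Fun
require_once( "$IP/extensions/RegexFun/RegexFun.php" );

# Overrides must come AFTER the inclusion above, otherwise the defaults
# from RegexFun_Settings.php would overwrite them.
$egRegexFunDisabledFunctions = array( 'regexall', 'regex_var' );

# Arbitrary example value; the default of -1 disables the limit.
$egRegexFunMaxRegexPerParse = 500;
```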
Index: tags/extensions/RegexFun/REL_1_0_2/README
@@ -0,0 +1,40 @@
 2+== About ==
 3+
 4+''Regex Fun'' is a MediaWiki extension by Daniel Werner which adds parser functions for performing regular expression
 5+searches and replacements.
 6+The '#regex' parser function is inspired by the 'RegexParserFunctions' extension by Jim R. Wilson and is mostly compatible
 7+with it. 'RegexParserFunctions' is simply outdated and lacks some advanced functionality provided by this extension.
 8+
 9+''Regex Fun'' defines the following parser functions within your wiki:
 10+
 11+ - #regex: Search or replace with the help of PHP preg regular expressions. Returns the first match in search mode.
 12+ Use of the 'e' modifier behind the expression is detected; the effect of 'e' is
 13+ adapted for MediaWiki. With 'e' the replacement string is parsed after references are replaced.
 14+ - #regexall: Searches the whole string for as many matches as possible and returns them separated by a separator.
 15+ - #regex_var: Gives access to subexpression references of the last used 'regex' function.
 16+ - #regexquote: Runs the PHP function 'preg_quote' on a string so that user input can be used safely in regex functions. In case the
 17+ first character is a character with special meaning in MW, it is replaced with its hexadecimal
 18+ notation, e.g. '\x23' instead of '#'. This prevents things from going terribly wrong when using
 19+ user input within a regular expression.
 20+
 21+* Website: http://www.mediawiki.org/wiki/Extension:Regex_Fun
 22+* License: ISC license
 23+* Author: Daniel Werner < danweetz@web.de >
 24+
 25+
 26+== Installation ==
 27+
 28+Once you have downloaded the code, place the 'RegexFun' directory within your
 29+MediaWiki 'extensions' directory. Then add the following code to your
 30+[[Manual:LocalSettings.php|LocalSettings.php]] file:
 31+
 32+ # Regex Fun
 33+ require_once( "$IP/extensions/RegexFun/RegexFun.php" );
 34+
 35+
 36+== Contributing ==
 37+
 38+If you have bug reports or requests, please add them to the ''Regex Fun'' Talk page [0].
 39+You can also send them to Daniel Werner < danweetz@web.de >
 40+
 41+[0] http://www.mediawiki.org/w/index.php?title=Extension_talk:Regex_Fun
\ No newline at end of file
Property changes on: tags/extensions/RegexFun/REL_1_0_2/README
___________________________________________________________________
Added: svn:eol-style
142 + native

Status & tagging log