r60869 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: r60869
Date: 18:43, 9 January 2010
Author: siebrand
Status: ok
Tags:
Comment:
(bug 21387) Make $ regex work for the URLs. Patch contributed by Platonides.

Bug comment: Set PCRE_MULTILINE on SpamBlacklist regexes. A $ in a spam blacklist regex should match the end of the URL (not the end of the whole text), so that it can be used to match only the main page. Since the candidate URLs are already joined with a newline separator, this only requires setting PCRE_MULTILINE on the regex.
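The effect described in the bug comment can be sketched as follows. This is a minimal illustration, not the extension's code: the blacklist fragment and the two URLs are hypothetical, and only the pattern prefix and the 'i' / 'im' modifiers are taken from the diff below.

```php
<?php
// Hypothetical blacklist fragment: intended to match only the main page.
$fragment = 'example\.com\/?$';

// Hypothetical candidate URLs, joined with a newline as the bug comment describes.
$links = implode( "\n", array(
	'http://example.com/',
	'http://example.org/some/page',
) );

// Same prefix as $regexStart in the diff; only the modifiers differ.
$withoutM = '/https?:\/\/+[a-z0-9_\-.]*(' . $fragment . ')/i';
$withM    = '/https?:\/\/+[a-z0-9_\-.]*(' . $fragment . ')/im';

// Without "m", "$" only matches at the end of the whole joined text,
// so the first URL is not caught. With "m", "$" also matches before "\n".
$matchesWithoutM = preg_match( $withoutM, $links );
$matchesWithM    = preg_match( $withM, $links );
```

Because the main-page URL is not the last line of the joined text, only the pattern with the m modifier should report a match here.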
Modified paths:
  • /trunk/extensions/SpamBlacklist/SpamBlacklist_body.php (modified)

Diff

Index: trunk/extensions/SpamBlacklist/SpamBlacklist_body.php
@@ -386,10 +386,10 @@
  # Make regex
  # It's faster using the S modifier even though it will usually only be run once
  //$regex = 'https?://+[a-z0-9_\-.]*(' . implode( '|', $lines ) . ')';
- //return '/' . str_replace( '/', '\/', preg_replace('|\\\*/|', '/', $regex) ) . '/Si';
+ //return '/' . str_replace( '/', '\/', preg_replace('|\\\*/|', '/', $regex) ) . '/Sim';
  $regexes = array();
  $regexStart = '/https?:\/\/+[a-z0-9_\-.]*(';
- $regexEnd = ($batchSize > 0 ) ? ')/Si' : ')/i';
+ $regexEnd = ($batchSize > 0 ) ? ')/Sim' : ')/im';
  $build = false;
  foreach( $lines as $line ) {
  if( substr( $line, -1, 1 ) == "\\" ) {

Comments

Comment by Lustiger seth, 10:40, 31 January 2010

Hi! I guess this is useless. The built regexp patterns are applied to _extracted_ links, see line 226: $newLinks = array_keys( $out->getExternalLinks() );

So there is only an array of n single-line strings to be matched, which means the m modifier has no effect here.

Comment by Lustiger seth, 10:45, 31 January 2010

Oops, sorry, I forgot line 237: $links = implode( "\n", $addedLinks ); So please disregard my previous comment. :-)
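The distinction raised in these two comments can be sketched like this. It is an illustrative sketch only: the pattern and the link list are hypothetical, and the per-link loop is not what the extension does. It is shown only to contrast with matching against the joined string.

```php
<?php
// Hypothetical pattern and links, for illustration only.
$pattern = 'example\.com\/?$';
$addedLinks = array( 'http://example.com/', 'http://example.org/some/page' );

// Case 1: each link matched on its own. "$" already means
// "end of this URL", so the m modifier would change nothing.
$perLink = array();
foreach ( $addedLinks as $link ) {
	if ( preg_match( '/https?:\/\/+[a-z0-9_\-.]*(' . $pattern . ')/i', $link ) ) {
		$perLink[] = $link;
	}
}

// Case 2: links joined with "\n" first, as in the implode() call quoted
// above. Here "m" is needed so that "$" matches at the end of each line.
$links = implode( "\n", $addedLinks );
preg_match_all( '/https?:\/\/+[a-z0-9_\-.]*(' . $pattern . ')/im', $links, $m );
$joined = $m[0];
```

Under these assumptions both approaches flag the same URL, which is why the m modifier matters only once the links have been joined into one multi-line string.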
