r55768 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: r55768
Date: 14:19, 3 September 2009
Author: thomasv
Status: resolved
Tags:
Comment: get rid of invalid UTF8, strip control characters
Modified paths:
  • /trunk/phase3/includes/DjVuImage.php (modified)

Diff

Index: trunk/phase3/includes/DjVuImage.php
@@ -250,6 +250,9 @@
 $txt = wfShellExec( $cmd, $retval );
 wfProfileOut( 'djvutxt' );
 if( $retval == 0) {
+	# Get rid of invalid UTF-8, strip control characters
+	$txt = iconv( "UTF-8","UTF-8//IGNORE", $txt );
+	$txt = preg_replace( "/[\013\035\037]/", "", $txt );
 	$txt = htmlspecialchars($txt);
 	$txt = preg_replace( "/\((page\s[\d-]*\s[\d-]*\s[\d-]*\s[\d-]*\s*\&quot;([^<]*?)\&quot;\s*|)\)/s", "<PAGE value=\"$2\" />", $txt );
 	$txt = "<DjVuTxt>\n<HEAD></HEAD>\n<BODY>\n" . $txt . "</BODY>\n</DjVuTxt>\n";
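
For readers who want to try the two added steps outside MediaWiki, here is a minimal standalone sketch. The function name `stripInvalidAndControl` and the sample string are hypothetical, and the guard against a `false` return is an addition that is not part of the commit. It assumes the iconv extension is loaded; how `//IGNORE` treats malformed input can differ between iconv implementations and PHP versions, so the sample uses well-formed input only.

```php
<?php
// Hypothetical helper; the commit applies these two steps inline in DjVuImage.php.
function stripInvalidAndControl( $txt ) {
	// Step 1: re-encode UTF-8 to UTF-8, asking iconv to drop invalid sequences.
	// Assumption: if iconv reports failure (false), keep the input unchanged.
	$converted = iconv( 'UTF-8', 'UTF-8//IGNORE', $txt );
	if ( $converted !== false ) {
		$txt = $converted;
	}
	// Step 2: remove vertical tab (octal 013), group separator (035)
	// and unit separator (037), the same character class the diff uses.
	return preg_replace( "/[\013\035\037]/", '', $txt );
}

// Sample input with the three control characters embedded.
$sample = "page one\013\035 line\037 two";
$cleaned = stripInvalidAndControl( $sample );
echo $cleaned, "\n";
```

Both steps run before `htmlspecialchars()` in the diff, so the later escaping and the `<PAGE>` rewrite operate on text that has already been cleaned.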

Follow-up revisions

Revision | Commit summary | Author | Date
r55817 | follow-up to r55768 | thomasv | 13:08, 4 September 2009
r55843 | use UtfNormal::cleanUp() (r55768) | thomasv | 06:45, 5 September 2009
r61258 | if available, use iconv because it requires less memory (fixes bug 21809) (fo... | thomasv | 17:23, 19 January 2010

Comments

Comment by Nikerabbit, 18:12, 3 September 2009

Probably needs error suppression, compare with r52830.
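
As an illustration of what error suppression around the `iconv()` call could look like, here is a small sketch. It is not taken from r52830 or from the follow-up revision, whose contents are not shown on this page. The wrapper name `iconvQuiet` is hypothetical, and the `@` operator stands in for whatever suppression mechanism the codebase prefers.

```php
<?php
// Hypothetical wrapper: call iconv without letting its notice or warning
// on malformed input reach the output or the error log.
function iconvQuiet( $txt ) {
	// '@' silences diagnostics for this one call only.
	// The return value is the converted string, or false if iconv gave up;
	// which of the two occurs on malformed input depends on the platform.
	return @iconv( 'UTF-8', 'UTF-8//IGNORE', $txt );
}

$ok = iconvQuiet( "plain text" );
$maybe = iconvQuiet( "a\xFFb" ); // contains an invalid byte
```

Callers still need to check for `false`, since suppressing the message does not change the return value.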

Comment by ThomasV, 13:09, 4 September 2009

I trust you on that.

Comment by Brion VIBBER, 23:11, 4 September 2009

Not sure how useful this'll be if native iconv isn't available. Probably you want to use UtfNormal::cleanUp() for this?
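
The concern about a missing native iconv can be sketched as a "use iconv if present, otherwise fall back" pattern. This is only an illustration: `cleanUtf8` and `stripInvalidUtf8Pure` are hypothetical names, and the pure-PHP fallback is a simple stand-in, not a reproduction of `UtfNormal::cleanUp()`, which is not shown on this page and may behave differently on malformed input.

```php
<?php
// Hypothetical pure-PHP fallback: keep only well-formed UTF-8 sequences.
// The byte ranges follow the UTF-8 definition in RFC 3629.
function stripInvalidUtf8Pure( $txt ) {
	preg_match_all(
		'/[\x00-\x7F]'                        // ASCII
		. '|[\xC2-\xDF][\x80-\xBF]'           // 2-byte sequences
		. '|\xE0[\xA0-\xBF][\x80-\xBF]'       // 3-byte, no overlongs
		. '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
		. '|\xED[\x80-\x9F][\x80-\xBF]'       // 3-byte, no surrogates
		. '|\xF0[\x90-\xBF][\x80-\xBF]{2}'    // 4-byte, no overlongs
		. '|[\xF1-\xF3][\x80-\xBF]{3}'
		. '|\xF4[\x80-\x8F][\x80-\xBF]{2}/',  // 4-byte, up to U+10FFFF
		$txt,
		$matches
	);
	return implode( '', $matches[0] );
}

// Hypothetical dispatcher: prefer iconv when the extension is loaded.
function cleanUtf8( $txt ) {
	if ( function_exists( 'iconv' ) ) {
		$out = @iconv( 'UTF-8', 'UTF-8//IGNORE', $txt );
		if ( $out !== false ) {
			return $out;
		}
	}
	// iconv missing or failed: use the pure-PHP path instead.
	return stripInvalidUtf8Pure( $txt );
}
```

The fallback collects every match into an array before joining, so it holds more data in memory than a single `iconv()` call on large inputs.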
