r93180 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: r93180 (previous: r93179, next: r93181)
Date: 14:22, 26 July 2011
Author: yaron
Status: deferred
Tags:
Comment:
Changed DB retrieval code to detect whether DB data is in UTF-8 encoding already, instead of assuming that it's ISO-8859-1 and always converting it
Modified paths:
  • /trunk/extensions/ExternalData/ED_Utils.php (modified)

Diff

Index: trunk/extensions/ExternalData/ED_Utils.php
@@ -258,10 +258,15 @@
         // doesn't get chopped off to just "b").
         $new_row = array();
         foreach ( $vars as $i => $column_name ) {
-            // Data that comes from the DB will
-            // (always?) be in ISO-8859-1 format -
-            // convert it to UTF8.
-            $new_row[$column_name] = utf8_encode( $row[$i] );
+            // Convert the encoding to UTF-8
+            // if necessary - based on code at
+            // http://www.php.net/manual/en/function.mb-detect-encoding.php#102510
+            $dbField = $row[$i];
+            if ( mb_detect_encoding( $dbField, 'UTF-8', true ) == 'UTF-8' ) {
+                $new_row[$column_name] = $dbField;
+            } else {
+                $new_row[$column_name] = utf8_encode( $dbField );
+            }
         }
         $rows[] = $new_row;
     }
@@ -293,8 +298,9 @@
     // regular expression copied from http://us.php.net/fgetcsv
     $vals = preg_split( '/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/', $csv_line );
     $vals2 = array();
-    foreach ( $vals as $val )
+    foreach ( $vals as $val ) {
         $vals2[] = trim( $val, '"' );
+    }
     return $vals2;
 }
 
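
The first hunk converts a DB field only when it is not already valid UTF-8. A minimal standalone sketch of that idea follows. It is not the code from ED_Utils.php: the function name `edEnsureUtf8` is hypothetical, the mbstring extension is assumed to be loaded, and `mb_check_encoding()` / `mb_convert_encoding()` are swapped in for `mb_detect_encoding()` / `utf8_encode()` because `utf8_encode()` is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper, not part of ED_Utils.php.
// Assumes the mbstring extension is available.
function edEnsureUtf8( $dbField ) {
    // Pass non-string values (e.g. NULL columns) through unchanged.
    if ( !is_string( $dbField ) ) {
        return $dbField;
    }
    // Already valid UTF-8: return as-is to avoid double encoding.
    if ( mb_check_encoding( $dbField, 'UTF-8' ) ) {
        return $dbField;
    }
    // Otherwise assume ISO-8859-1, as the original code did.
    return mb_convert_encoding( $dbField, 'UTF-8', 'ISO-8859-1' );
}

// Usage: "\xE9" is "é" in ISO-8859-1, "\xC3\xA9" is "é" in UTF-8.
$latin1 = "caf\xE9";
$utf8 = "caf\xC3\xA9";
$converted = edEnsureUtf8( $latin1 );
$unchanged = edEnsureUtf8( $utf8 );
```

Note that this check is a heuristic: a byte sequence that is valid UTF-8 is always treated as UTF-8, even if it was in fact written as ISO-8859-1 text.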

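The second hunk only adds braces, but its context shows how a CSV line is split: the regular expression matches commas that are followed by an even number of double quotes, i.e. commas outside a quoted field. The sketch below wraps the same expression in a hypothetical function `splitCsvLine` (the real function name is not visible in the diff) to show the effect on a sample line.

```php
<?php
// Hypothetical wrapper around the split logic shown in the diff context.
function splitCsvLine( $csv_line ) {
    // Split only on commas that are not inside a double-quoted field.
    $vals = preg_split( '/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/', $csv_line );
    $vals2 = array();
    foreach ( $vals as $val ) {
        // Strip the surrounding double quotes from each field.
        $vals2[] = trim( $val, '"' );
    }
    return $vals2;
}

// The comma inside "b,c" is kept as part of the field.
$fields = splitCsvLine( 'a,"b,c",d' );
```

With this input, `$fields` holds the three values `a`, `b,c` and `d`. Doubled quotes inside a field are not unescaped by this approach.
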
Follow-up revisions

Revision | Commit summary | Author | Date
r102326 | Follow-up to r93180 - fixed handling if mb_detect_encoding() function does no... | yaron | 21:10, 7 November 2011
