r82927 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r82926‎ | r82927 | r82928 >
Date:03:15, 28 February 2011
Author:tstarling
Status:resolved (Comments)
Tags:
Comment:
Followup for r81340:
* Allow any language code which consists entirely of valid title characters, and does not contain any path-traversal characters, to be customised via the uselang parameter. Language::isValidCode() represents this concept.
* Add some shortcuts preventing Language and LocalisationCache from looking for localisation files for a language code which does not follow the usual form of language codes in MediaWiki, i.e. /[a-z-]*/. This concept is represented by Language::isValidBuiltInCode().
* Do not allow colon characters in file names, per Platonides' suggestion on CR.
Modified paths:
  • /trunk/phase3/includes/LocalisationCache.php (modified) (history)
  • /trunk/phase3/languages/Language.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/LocalisationCache.php
@@ -343,6 +343,12 @@
344344 }
345345 $this->initialisedLangs[$code] = true;
346346
 347+ # If the code is of the wrong form for a Messages*.php file, do a shallow fallback
 348+ if ( !Language::isValidBuiltInCode( $code ) ) {
 349+ $this->initShallowFallback( $code, 'en' );
 350+ return;
 351+ }
 352+
347353 # Recache the data if necessary
348354 if ( !$this->manualRecache && $this->isExpired( $code ) ) {
349355 if ( file_exists( Language::getMessagesFileName( $code ) ) ) {
Index: trunk/phase3/languages/Language.php
@@ -157,11 +157,19 @@
158158
159159 // Protect against path traversal below
160160 if ( !Language::isValidCode( $code )
161 - || strcspn( $code, "/\\\000" ) !== strlen( $code ) )
 161+ || strcspn( $code, ":/\\\000" ) !== strlen( $code ) )
162162 {
163163 throw new MWException( "Invalid language code \"$code\"" );
164164 }
165165
 166+ if ( !Language::isValidBuiltInCode( $code ) ) {
 167+ // It's not possible to customise this code with class files, so
 168+ // just return a Language object. This is to support uselang= hacks.
 169+ $lang = new Language;
 170+ $lang->setCode( $code );
 171+ return $lang;
 172+ }
 173+
166174 if ( $code == 'en' ) {
167175 $class = 'Language';
168176 } else {
@@ -193,13 +201,24 @@
194202
195203 /**
196204 * Returns true if a language code string is of a valid form, whether or
197 - * not it exists.
 205+ * not it exists. This includes codes which are used solely for
 206+ * customisation via the MediaWiki namespace.
198207 */
199208 public static function isValidCode( $code ) {
200 - return strcspn( $code, "/\\\000" ) === strlen( $code );
 209+ return
 210+ strcspn( $code, ":/\\\000" ) === strlen( $code )
 211+ && !preg_match( Title::getTitleInvalidRegex(), $code );
201212 }
202213
203214 /**
 215+ * Returns true if a language code is of a valid form for the purposes of
 216+ * internal customisation of MediaWiki, via Messages*.php.
 217+ */
 218+ public static function isValidBuiltInCode( $code ) {
 219+ return preg_match( '/^[a-z0-9-]*$/', $code );
 220+ }
 221+
 222+ /**
204223 * Get the LocalisationCache instance
205224 *
206225 * @return LocalisationCache
@@ -2859,7 +2878,7 @@
28602879 static function getFileName( $prefix = 'Language', $code, $suffix = '.php' ) {
28612880 // Protect against path traversal
28622881 if ( !Language::isValidCode( $code )
2863 - || strcspn( $code, "/\\\000" ) !== strlen( $code ) )
 2882+ || strcspn( $code, ":/\\\000" ) !== strlen( $code ) )
28642883 {
28652884 throw new MWException( "Invalid language code \"$code\"" );
28662885 }

Sign-offs

UserFlagDate
Hasharinspected18:27, 3 March 2011
Hashartested19:15, 3 March 2011

Follow-up revisions

RevisionCommit summaryAuthorDate
r83160Fix language code validation (from r82927)...hashar19:13, 3 March 2011
r83223Underscore are not valid in language code...hashar17:16, 4 March 2011
r95646In Language::isValidBuiltInCode(), reject the empty string per Nikerabbit's r...tstarling04:01, 29 August 2011

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r81340Relax Language::isValidCode() to avoid breaking uselang hackststarling22:54, 1 February 2011

Comments

#Comment by Hashar (talk | contribs)   18:24, 3 March 2011

Using $wgLanguageCode = "fr"; this revision break (most probably fix) the following tests:

  • LanguageBeTaraskTest::testSearchRightSingleQuotationMarkAsApostroph
  • LanguageBeTaraskTest::testDoesNotCommafyFourDigitsNumber
3) LanguageBeTaraskTest::testSearchRightSingleQuotationMarkAsApostroph
[https://bugzilla.wikimedia.org/show_bug.cgi?id=23156 bug 23156]: U+2019 conversion to U+0027
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'
+’

/var/www/mediawiki-trunk/tests/phpunit/languages/LanguageBe_taraskTest.php:20
/var/www/mediawiki-trunk/tests/phpunit/MediaWikiTestCase.php:61
/var/www/mediawiki-trunk/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/var/www/mediawiki-trunk/tests/phpunit/phpunit.php:61

4) LanguageBeTaraskTest::testDoesNotCommafyFourDigitsNumber
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-1234
+1,234

/var/www/mediawiki-trunk/tests/phpunit/languages/LanguageBe_taraskTest.php:29
/var/www/mediawiki-trunk/tests/phpunit/MediaWikiTestCase.php:61
/var/www/mediawiki-trunk/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/var/www/mediawiki-trunk/tests/phpunit/phpunit.php:61
#Comment by Hashar (talk | contribs)   18:27, 3 March 2011

Although the tests use an object build with:

Language::factory( 'Be_tarask' );
#Comment by Hashar (talk | contribs)   19:14, 3 March 2011

Should be fixed with r83160 which allow underscore and uppercase characters. Tests included :)

#Comment by Nikerabbit (talk | contribs)   20:47, 24 August 2011

Why does isValidBuiltInCode accept empty string?

#Comment by Hashar (talk | contribs)   09:57, 27 August 2011

Marked fixme per your comment. The regex probably need to use + instead of *.

#Comment by Tim Starling (talk | contribs)   04:02, 29 August 2011

Marking new after r95646.

Status & tagging log