r84057 MediaWiki - Code Review archive

Repository:	MediaWiki
Revision:	r84057
Date:	21:56, 15 March 2011
Author:	hashar
Status:	reverted
Tags:
Comment:	bug 28040 Turkish: properly handle dotted and dotless i
	As mentioned by Bawolff on code review, r83970 only handled the case change of the first character and lacked support for full strings.
	This patch overrides the uc and lc methods for the Turkish language (tr) using preg_replace(), which knows about Unicode. Other possible choices would have been:
	- strtr() => outputs garbage
	- mbstring => cannot know that we are handling Turkish, and transforms i to I!
	I have amended the RELEASE-NOTES to reflect this patch. Some new tests are added to cover the regular functions as well as the Turkish-specific overrides.
	Result in testdox:
	LanguageTr
	[x] Change case of first char being dotted and dotless i
	[x] Language tr lower casing override
	[x] Language tr upper casing override
	[x] Upper casing of a string with dotted and dot less i
	[x] Lower casing of a string with dotted and dot less i
Modified paths:	/trunk/phase3/RELEASE-NOTES (modified), /trunk/phase3/languages/classes/LanguageTr.php (modified), /trunk/phase3/tests/phpunit/languages/LanguageTrTest.php (modified)

Diff

Index: trunk/phase3/tests/phpunit/languages/LanguageTrTest.php
—	—	@@ -24,7 +24,7 @@
25	25	* @see http://en.wikipedia.org/wiki/Dotted_and_dotless_I
26	26	* @dataProvider provideDottedAndDotlessI
27	27	*/
28		- function testDottedAndDotlessI( $func, $input, $inputCase, $expected ) {
	28	+ function testChangeCaseOfFirstCharBeingDottedAndDotlessI( $func, $input, $inputCase, $expected ) {
29	29	if( $func == 'ucfirst' ) {
30	30	$res = $this->lang->ucfirst( $input );
31	31	} elseif( $func == 'lcfirst' ) {
—	—	@@ -62,6 +62,60 @@
63	63	array( 'lcfirst', 'IPhone', 'upper', 'ıPhone' ),
64	64
65	65	);
	66	+ }
	67	+
	68	+##### LanguageTr specificities #############################################
	69	+ /**
	70	+ * @cover LanguageTr:lc
	71	+ * See @bug 28040
	72	+ */
	73	+ function testLanguageTrLowerCasingOverride() {
	74	+ $this->assertEquals( 'ııııı', $this->lang->lc( 'IIIII') );
66	75	}
	76	+ /**
	77	+ * @cover LanguageTr:uc
	78	+ * See @bug 28040
	79	+ */
	80	+ function testLanguageTrUpperCasingOverride() {
	81	+ $this->assertEquals( 'İİİİİ', $this->lang->uc( 'iiiii') );
	82	+ }
67	83
	84	+##### Upper casing a string #################################################
	85	+ /**
	86	+ * Generic test for the Turkish dotted and dotless I strings
	87	+ * See @bug 28040
	88	+ * @dataProvider provideUppercaseStringsWithDottedAndDotlessI
	89	+ */
	90	+ function testUpperCasingOfAStringWithDottedAndDotLessI( $expected, $input ) {
	91	+ $this->assertEquals( $expected, $this->lang->uc( $input ) );
	92	+ }
	93	+ function provideUppercaseStringsWithDottedAndDotlessI() {
	94	+ return array(
	95	+ # expected, input string to uc()
	96	+ array( 'IIIII', 'ııııı' ),
	97	+ array( 'IIIII', 'IIIII' ), #identity
	98	+ array( 'İİİİİ', 'iiiii' ), # Specifically handled by LanguageTr:uc
	99	+ array( 'İİİİİ', 'İİİİİ' ), #identity
	100	+ );
	101	+ }
	102	+
	103	+##### Lower casing a string #################################################
	104	+ /**
	105	+ * Generic test for the Turkish dotted and dotless I strings
	106	+ * See @bug 28040
	107	+ * @dataProvider provideLowercaseStringsWithDottedAndDotlessI
	108	+ */
	109	+ function testLowerCasingOfAStringWithDottedAndDotLessI( $expected, $input ) {
	110	+ $this->assertEquals( $expected, $this->lang->lc( $input ) );
	111	+ }
	112	+ function provideLowercaseStringsWithDottedAndDotlessI() {
	113	+ return array(
	114	+ # expected, input string to lc()
	115	+ array( 'ııııı', 'IIIII' ), # Specifically handled by LanguageTr:lc
	116	+ array( 'ııııı', 'ııııı' ), #identity
	117	+ array( 'iiiii', 'İİİİİ' ),
	118	+ array( 'iiiii', 'iiiii' ), #identity
	119	+ );
	120	+ }
	121	+
68	122	}
Index: trunk/phase3/languages/classes/LanguageTr.php
—	—	@@ -28,4 +28,16 @@
29	29	}
30	30	}
31	31
	32	+ /** @see bug 28040 */
	33	+ function uc( $string ) {
	34	+ $string = preg_replace( '/i/', 'İ', $string );
	35	+ return parent::uc( $string );
	36	+ }
	37	+
	38	+ /** @see bug 28040 */
	39	+ function lc( $string ) {
	40	+ $string = preg_replace( '/I/', 'ı', $string );
	41	+ return parent::lc( $string );
	42	+ }
	43	+
32	44	}
Index: trunk/phase3/RELEASE-NOTES
—	—	@@ -277,7 +277,8 @@
278	278	* (bug 27681) Set $namespaceGenderAliases for Portuguese (pt and pt-br)
279	279	* (bug 27785) Fallback language for Kabardian (kbd) is English now.
280	280	* (bug 27825) Raw watchlist edit message now uses formatted numbers.
281		-* (bug 28040) Turkish: properly lower case 'I' to 'ı' (dotless i)
	281	+* (bug 28040) Turkish: properly lower case 'I' to 'ı' (dotless i) and
	282	+ uppercase 'i' to 'İ' (dotted i)
282	283
283	284	== Compatibility ==
284	285
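
The behaviour this patch aims for can be illustrated outside MediaWiki. The following is a minimal sketch, not the committed code: `trUpper()` and `trLower()` are hypothetical stand-alone helpers, the mbstring extension is assumed to be available, and `str_replace()` is used where the patch uses `preg_replace()`. Unlike the patch, the lower-casing helper also maps 'İ' to 'i' explicitly rather than leaving that to the generic conversion.

```php
<?php
// Illustrative stand-ins only; these are not MediaWiki's Language/LanguageTr methods.

/** Upper-case a UTF-8 string using the Turkish rule for the letter i. */
function trUpper( string $s ): string {
	// Map dotted lower-case i to dotted capital İ first, so the generic
	// conversion below never turns it into a plain ASCII I.
	$s = str_replace( 'i', 'İ', $s );
	return mb_strtoupper( $s, 'UTF-8' );
}

/** Lower-case a UTF-8 string using the Turkish rules for I and İ. */
function trLower( string $s ): string {
	// Handle both capitals explicitly: I -> ı (dotless) and İ -> i (dotted).
	$s = str_replace( array( 'I', 'İ' ), array( 'ı', 'i' ), $s );
	return mb_strtolower( $s, 'UTF-8' );
}
```

With these helpers, `trUpper( 'istanbul' )` gives 'İSTANBUL', whereas a plain `mb_strtoupper( 'istanbul', 'UTF-8' )` gives 'ISTANBUL', which is the mismatch the commit message describes for mbstring.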

Follow-up revisions

Revision	Commit summary	Author	Date
r84080	Makes LanguageTr uc & lc match parent declaration...	hashar	07:38, 16 March 2011
r99074	Fixes for r84057 LanguageTr uc/lc:...	tstarling	02:31, 6 October 2011
r99246	Tests for bug 31490 : turkish magic word with a 'i' are broken :d...	hashar	20:18, 7 October 2011
r99290	Revert r84057, r84080, part of r99074: lc() and uc() custom handling for Turk...	brion	00:30, 8 October 2011

Past revisions this follows-up on

Revision	Commit summary	Author	Date
r83970	bug 28040 Turkish: properly lower case 'I' to 'ı' (dotless i)...	hashar	22:14, 14 March 2011

Comments

#Comment by Hashar (talk | contribs) 21:58, 15 March 2011

bug 28040 amended and still open.

#Comment by Raymond (talk | contribs) 07:37, 16 March 2011

Seen on Translatewiki:

PHP Strict Standards:  Declaration of LanguageTr::lc() should be compatible with that of Language::lc() in /www/w/languages/classes/LanguageTr.php on line 43
PHP Strict Standards:  Declaration of LanguageTr::uc() should be compatible with that of Language::uc() in /www/w/languages/Language.php on line 182

#Comment by Hashar (talk | contribs) 09:02, 16 March 2011

Should be taken care of by follow-up r84080. One of my computers had an incorrect PHP configuration, so those errors were not shown :(
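
For readers unfamiliar with this warning, here is a minimal sketch of what a compatible override looks like. `BaseLanguage` and `TurkishLanguage` are hypothetical stand-ins, and the optional `$first` parameter is an assumption about the parent declaration, which is not shown in this revision. The content of r84080 is not reproduced here.

```php
<?php
// Hypothetical stand-ins for Language and LanguageTr; the $first parameter is assumed.
class BaseLanguage {
	public function uc( $str, $first = false ) {
		if ( $first ) {
			// Upper-case only the first character and keep the rest unchanged.
			return mb_strtoupper( mb_substr( $str, 0, 1, 'UTF-8' ), 'UTF-8' )
				. mb_substr( $str, 1, null, 'UTF-8' );
		}
		return mb_strtoupper( $str, 'UTF-8' );
	}
}

class TurkishLanguage extends BaseLanguage {
	// Same parameter list as the parent, so PHP reports no incompatible declaration.
	public function uc( $str, $first = false ) {
		if ( $first ) {
			// Only a leading dotted i needs the Turkish mapping.
			if ( mb_substr( $str, 0, 1, 'UTF-8' ) === 'i' ) {
				$str = 'İ' . mb_substr( $str, 1, null, 'UTF-8' );
			}
		} else {
			$str = str_replace( 'i', 'İ', $str );
		}
		return parent::uc( $str, $first );
	}
}
```

Declaring the override as `uc( $string )` alone, as in this revision's diff, drops a parameter that the parent accepts, which is the kind of mismatch the Strict Standards notice reports.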

#Comment by MaxSem (talk | contribs) 18:20, 7 October 2011

Causes bug 31490 - lcfirst and ucfirst parser functions do not work

#Comment by Brion VIBBER (talk | contribs) 18:29, 7 October 2011

Specifically it looks like it breaks case-insensitive matching of magic words that contain the letter 'i' or 'I'.

So {{ucfirst:x}} doesn't match with the 'ucfirst' keyword anymore, whereas 'UCFIRST' or 'ucfırst' do.
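
One way to reproduce this observation is sketched below. It assumes the keyword is registered in upper case and that both the keyword and the user input are normalised with the Turkish lc(); the actual magic word matching code is not part of this revision, and `turkishLc()` and `matchesKeyword()` are hypothetical helpers.

```php
<?php
// Hypothetical helpers; MediaWiki's real magic word lookup is not shown here.

/** Lower-case with Turkish rules: I -> ı and İ -> i. */
function turkishLc( string $s ): string {
	$s = str_replace( array( 'I', 'İ' ), array( 'ı', 'i' ), $s );
	return mb_strtolower( $s, 'UTF-8' );
}

/** Case-insensitive comparison done by lower-casing both sides. */
function matchesKeyword( string $input, string $keyword ): bool {
	return turkishLc( $input ) === turkishLc( $keyword );
}

// Assumption: the keyword is stored as 'UCFIRST', so it normalises to 'ucfırst'.
var_dump( matchesKeyword( 'ucfirst', 'UCFIRST' ) ); // bool(false)
var_dump( matchesKeyword( 'UCFIRST', 'UCFIRST' ) ); // bool(true)
var_dump( matchesKeyword( 'ucfırst', 'UCFIRST' ) ); // bool(true)
```

Under that assumption the ASCII spelling 'ucfirst' keeps its dotted i while the stored keyword ends up with a dotless ı, so the two never compare equal.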

#Comment by MaxSem (talk | contribs) 19:24, 7 October 2011

We could normalize magic words by passing them through lc() and then uc() in LocalisationCache, but I don't know what else this revision could potentially break.

#Comment by Hashar (talk | contribs) 20:30, 7 October 2011

We should probably revert this revision on live wikis pending a proper fix.

#Comment by Szoszv (talk | contribs) 19:39, 7 October 2011

So special pages (özel sayfalar) do not work either: http://tr.wikipedia.org/wiki/Special:SpecialPages or http://tr.wikipedia.org/wiki/%C3%96zel:%C3%96zelSayfalar

Status & tagging log

00:41, 8 October 2011 Brion VIBBER (talk | contribs) changed the status of r84057 [removed: fixme added: reverted]
18:20, 7 October 2011 MaxSem (talk | contribs) changed the status of r84057 [removed: resolved added: fixme]
10:57, 27 July 2011 Hashar (talk | contribs) changed the tags for r84057 [removed: patchset-turkish]
21:09, 30 June 2011 Aaron Schulz (talk | contribs) changed the status of r84057 [removed: new added: resolved]
10:07, 17 March 2011 Hashar (talk | contribs) changed the tags for r84057 [added: patchset-turkish]
09:04, 16 March 2011 Raymond (talk | contribs) changed the status of r84057 [removed: fixme added: new]
07:37, 16 March 2011 Raymond (talk | contribs) changed the status of r84057 [removed: new added: fixme]