r94995 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r94994‎ | r94995 | r94996 >
Date:11:23, 19 August 2011
Author:catrope
Status:ok (Comments)
Tags:config, scaptrap 
Comment:
Per CR on r44412 and my promise in the commit summary of r94990, stop abusing $wgInternalServer (intended for Squid URLs) for IRC/e-mail URLs and introduce $wgCanonicalServer for these purposes instead. This revision introduces two new hooks for WMF hacks, in exchange for making the core code saner.

* Introduce $wgCanonicalServer, which should typically be a fully qualified version of $wgServer but in practice can be anything that you'd like to be used in IRC/e-mail notifs
** Default value is $wgServer, expanded to http:// if protocol-relative
** This means you can easily set HTTPS as the 'default' protocol to use in IRC and e-mail notifs by setting $wgCanonicalServer to https://example.com
* Introduce Title::getCanonicalURL(). Similar to getInternalURL(), including a hook for WMF usage (which will be needed as long as secure.wikimedia.org is used)
** Also add escapeCanonicalURL(). Due to some ridiculous accident of history, the other escapeFooURL() functions don't have a $variant parameter; I decided not to follow that bad example
* Reinstate the spirit of r44406 and r44412: instead of calling getInternalURL() (or getCanonicalURL()) and regexing the title parameter out, obtain the path to index.php using $wgCanonicalServer . $wgScript and append params to that. Sadly, we need to add a hook here to support the secure server hack for WMF, but that's the price of saner code in this case
* Introduce the {{canonicalurl:}} and {{canonicalurle:}} parser functions, which work just like {{fullurl:}} and {{fullurle:}} except that they use getCanonicalURL() instead of getFullURL()
* Use {{canonicalurl:}} in the enotif_body message, fixing bug 29993 (protocol-relative URLs appear in e-mail notifications)
Modified paths:
  • /trunk/phase3/includes/DefaultSettings.php (modified) (history)
  • /trunk/phase3/includes/RecentChange.php (modified) (history)
  • /trunk/phase3/includes/Setup.php (modified) (history)
  • /trunk/phase3/includes/Title.php (modified) (history)
  • /trunk/phase3/includes/parser/CoreParserFunctions.php (modified) (history)
  • /trunk/phase3/languages/messages/MessagesEn.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/RecentChange.php
@@ -682,7 +682,8 @@
683683 }
684684
685685 public function getIRCLine() {
686 - global $wgUseRCPatrol, $wgUseNPPatrol, $wgRC2UDPInterwikiPrefix, $wgLocalInterwiki;
 686+ global $wgUseRCPatrol, $wgUseNPPatrol, $wgRC2UDPInterwikiPrefix, $wgLocalInterwiki,
 687+ $wgCanonicalServer, $wgScript;
687688
688689 if( $this->mAttribs['rc_type'] == RC_LOG ) {
689690 $titleObj = Title::newFromText( 'Log/' . $this->mAttribs['rc_log_type'], NS_SPECIAL );
@@ -695,19 +696,18 @@
696697 if( $this->mAttribs['rc_type'] == RC_LOG ) {
697698 $url = '';
698699 } else {
 700+ $url = $wgCanonicalServer . $wgScript;
699701 if( $this->mAttribs['rc_type'] == RC_NEW ) {
700 - $url = 'oldid=' . $this->mAttribs['rc_this_oldid'];
 702+ $query = '?oldid=' . $this->mAttribs['rc_this_oldid'];
701703 } else {
702 - $url = 'diff=' . $this->mAttribs['rc_this_oldid'] . '&oldid=' . $this->mAttribs['rc_last_oldid'];
 704+ $query = '?diff=' . $this->mAttribs['rc_this_oldid'] . '&oldid=' . $this->mAttribs['rc_last_oldid'];
703705 }
704706 if ( $wgUseRCPatrol || ( $this->mAttribs['rc_type'] == RC_NEW && $wgUseNPPatrol ) ) {
705 - $url .= '&rcid=' . $this->mAttribs['rc_id'];
 707+ $query .= '&rcid=' . $this->mAttribs['rc_id'];
706708 }
707 - // XXX: *HACK* this should use getFullURL(), hacked for SSL madness --brion 2005-12-26
708 - // XXX: *HACK^2* the preg_replace() undoes much of what getInternalURL() does, but we
709 - // XXX: need to call it so that URL paths on the Wikimedia secure server can be fixed
710 - // XXX: by a custom GetInternalURL hook --vyznev 2008-12-10
711 - $url = preg_replace( '/title=[^&]*&/', '', $titleObj->getInternalURL( $url ) );
 709+ // HACK: We need this hook for WMF's secure server setup
 710+ wfRunHooks( 'IRCLineURL', array( &$url, &$query ) );
 711+ $url .= $query;
712712 }
713713
714714 if( isset( $this->mExtra['oldSize'] ) && isset( $this->mExtra['newSize'] ) ) {
Index: trunk/phase3/includes/parser/CoreParserFunctions.php
@@ -35,6 +35,8 @@
3636 $parser->setFunctionHook( 'localurle', array( __CLASS__, 'localurle' ), SFH_NO_HASH );
3737 $parser->setFunctionHook( 'fullurl', array( __CLASS__, 'fullurl' ), SFH_NO_HASH );
3838 $parser->setFunctionHook( 'fullurle', array( __CLASS__, 'fullurle' ), SFH_NO_HASH );
 39+ $parser->setFunctionHook( 'canonicalurl', array( __CLASS__, 'canonicalurl' ), SFH_NO_HASH );
 40+ $parser->setFunctionHook( 'canonicalurle', array( __CLASS__, 'canonicalurle' ), SFH_NO_HASH );
3941 $parser->setFunctionHook( 'formatnum', array( __CLASS__, 'formatnum' ), SFH_NO_HASH );
4042 $parser->setFunctionHook( 'grammar', array( __CLASS__, 'grammar' ), SFH_NO_HASH );
4143 $parser->setFunctionHook( 'gender', array( __CLASS__, 'gender' ), SFH_NO_HASH );
@@ -218,6 +220,8 @@
219221 static function localurle( $parser, $s = '', $arg = null ) { return self::urlFunction( 'escapeLocalURL', $s, $arg ); }
220222 static function fullurl( $parser, $s = '', $arg = null ) { return self::urlFunction( 'getFullURL', $s, $arg ); }
221223 static function fullurle( $parser, $s = '', $arg = null ) { return self::urlFunction( 'escapeFullURL', $s, $arg ); }
 224+ static function canonicalurl( $parser, $s = '', $arg = null ) { return self::urlFunction( 'getCanonicalURL', $s, $arg ); }
 225+ static function canonicalurle( $parser, $s = '', $arg = null ) { return self::urlFunction( 'escapeCanonicalURL', $s, $arg ); }
222226
223227 static function urlFunction( $func, $s = '', $arg = null ) {
224228 $title = Title::newFromText( $s );
Index: trunk/phase3/includes/Setup.php
@@ -343,6 +343,11 @@
344344 wfProfileOut( $fname . '-includes' );
345345 }
346346
 347+# Now that GlobalFunctions is loaded, set the default for $wgCanonicalServer
 348+if ( $wgCanonicalServer === false ) {
 349+ $wgCanonicalServer = wfExpandUrl( $wgServer, PROTO_HTTP );
 350+}
 351+
347352 wfProfileIn( $fname . '-misc1' );
348353
349354 # Raise the memory limit if it's too low
Index: trunk/phase3/includes/Title.php
@@ -987,6 +987,13 @@
988988 public function escapeFullURL( $query = '' ) {
989989 return htmlspecialchars( $this->getFullURL( $query ) );
990990 }
 991+
 992+ /**
 993+ * HTML-escaped version of getCanonicalURL()
 994+ */
 995+ public function escapeCanonicalURL( $query = '', $variant = false ) {
 996+ return htmlspecialchars( $this->getCanonicalURL( $query, $variant ) );
 997+ }
991998
992999 /**
9931000 * Get the URL form for an internal link.
@@ -1010,6 +1017,22 @@
10111018 }
10121019
10131020 /**
 1021+ * Get the URL for a canonical link, for use in things like IRC and
 1022+ * e-mail notifications. Uses $wgCanonicalServer and the
 1023+ * GetCanonicalURL hook.
 1024+ *
 1025+ * @param $query string An optional query string
 1026+ * @param $variant string Language variant of URL (for sr, zh, ...)
 1027+ * @return string The URL
 1028+ */
 1029+ public function getCanonicalURL( $query = '', $variant = false ) {
 1030+ global $wgCanonicalServer;
 1031+ $url = $wgCanonicalServer . $this->getLocalURL( $query, $variant );
 1032+ wfRunHooks( 'GetCanonicalURL', array( &$this, &$url, $query ) );
 1033+ return $url;
 1034+ }
 1035+
 1036+ /**
10141037 * Get the edit URL for this Title
10151038 *
10161039 * @return String the URL, or a null string if this is an
Index: trunk/phase3/includes/DefaultSettings.php
@@ -39,19 +39,31 @@
4040 $wgSitename = 'MediaWiki';
4141
4242 /**
43 - * URL of the server. It will be automatically built including https mode.
 43+ * URL of the server.
4444 *
4545 * Example:
4646 * <code>
47 - * $wgServer = http://example.com
 47+ * $wgServer = 'http://example.com';
4848 * </code>
4949 *
5050 * This is usually detected correctly by MediaWiki. If MediaWiki detects the
5151 * wrong server, it will redirect incorrectly after you save a page. In that
5252 * case, set this variable to fix it.
 53+ *
 54+ * If you want to use protocol-relative URLs on your wiki, set this to a
 55+ * protocol-relative URL like '//example.com' and set $wgCanonicalServer
 56+ * to a fully qualified URL.
5357 */
5458 $wgServer = WebRequest::detectServer();
5559
 60+/**
 61+ * Canonical URL of the server, to use in IRC feeds and notification e-mails.
 62+ * Must be fully qualified, even if $wgServer is protocol-relative.
 63+ *
 64+ * Defaults to $wgServer, expanded to a fully qualified http:// URL if needed.
 65+ */
 66+$wgCanonicalServer = false;
 67+
5668 /************************************************************************//**
5769 * @name Script path settings
5870 * @{
@@ -120,6 +132,7 @@
121133 */
122134 $wgLoadScript = false;
123135
 136+
124137 /**@}*/
125138
126139 /************************************************************************//**
Index: trunk/phase3/languages/messages/MessagesEn.php
@@ -310,6 +310,8 @@
311311 'plural' => array( 0, 'PLURAL:' ),
312312 'fullurl' => array( 0, 'FULLURL:' ),
313313 'fullurle' => array( 0, 'FULLURLE:' ),
 314+ 'canonicalurl' => array( 0, 'CANONICALURL:' ),
 315+ 'canonicalurle' => array( 0, 'CANONICALURLE:' ),
314316 'lcfirst' => array( 0, 'LCFIRST:' ),
315317 'ucfirst' => array( 0, 'UCFIRST:' ),
316318 'lc' => array( 0, 'LC:' ),
@@ -2772,16 +2774,16 @@
27732775
27742776 --
27752777 To change your email notification settings, visit
2776 -{{fullurl:{{#special:Preferences}}}}
 2778+{{canonicalurl:{{#special:Preferences}}}}
27772779
27782780 To change your watchlist settings, visit
2779 -{{fullurl:{{#special:EditWatchlist}}}}
 2781+{{canonicalurl:{{#special:EditWatchlist}}}}
27802782
27812783 To delete the page from your watchlist, visit
27822784 $UNWATCHURL
27832785
27842786 Feedback and further assistance:
2785 -{{fullurl:{{MediaWiki:Helppage}}}}',
 2787+{{canonicalurl:{{MediaWiki:Helppage}}}}',
27862788
27872789 # Delete
27882790 'deletepage' => 'Delete page',

Sign-offs

UserFlagDate
Nikerabbitinspected14:43, 19 August 2011
Brion VIBBERinspected23:27, 23 August 2011

Follow-up revisions

RevisionCommit summaryAuthorDate
r94997Followup r94995: add hooks.txt entries for the new hookscatrope14:05, 19 August 2011
r95000Use canonical URLs (introduced in r94995) in the OpenSearch discovery thingycatrope14:39, 19 August 2011
r95002Use canonical URLs (introduced in r94995) for all URLs in e-mailscatrope14:46, 19 August 2011
r955051.17wmf1: Merge a truckload of HTTPS / prot rel URL fixes: r93847, r94990, r9...catrope19:32, 25 August 2011
r95607MFT to REL1_18...hashar19:28, 27 August 2011
r95627Fix bug in r94995: getCanonicalUrl() doesn't append the fragment. This is cor...catrope15:15, 28 August 2011
r96180Use wfExpandUrl+PROTO_CANONICAL inside of getCanonicalURL instead of just pre...dantman13:17, 3 September 2011

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r44406(bug 4253, bug 16586) Don't repeat titles of new pages in URLs in the RC->IRC...vyznev15:56, 10 December 2008
r44412Seems r44406 would've broken the RC feed on secure.wikimedia org: fix by intr...vyznev17:03, 10 December 2008
r94990Followup r94754: move protocol expansion from getIRCLine() to inside getInter...catrope08:56, 19 August 2011

Comments

#Comment by Brion VIBBER (talk | contribs)   23:26, 23 August 2011

Looks good overall! Doc comments I think need to be tweaked to indicate $query can be an array, as it is on getLocalURL etc.

Deployment configuration in secure.php will need to be updated to continue supporting the secure.wikimedia.org gateway:

  • apply the GetInternalURL hook handler to GetCanonicalURL as well (to collapse the expanded path back to the regular path)
  • equivalent function for IRCLineURL
  • set $wgCanonicalServer to the $wgInternalServer value

Note that we may not want to set $wgCanonicalServer to use 'https://', as it'll leave the same problems we have now with links in notification emails depending on what someone *else* did instead of what *you* did.

In the future though I recommend setting it always to use 'https://', for a magical future where we run all login sessions of SSL like sane people. :)

#Comment by Catrope (talk | contribs)   09:14, 24 August 2011
  • Doc comments: sure, will tweak
  • secure.php: I know, that's why there's a big fat scaptrap on this :)
  • CanonicalServer: maybe I didn't document this well enough, but $wgCanonicalServer must always be the same, regardless of which protocol the user comes in on. This ensures consistency in e-mails, but also prevents cache pollution. We can choose to set it to http (which we will start out with) and we can switch it to https when we feel comfortable to, but the important thing is that it's set to a preferred protocol which is fixed. Setting wgCanonicalServer to https won't reintroduce the notification e-mail problems, it'll simply make all links in notification e-mails show https for everyone, so things don't depend on what someone else did, but on what we decide should be the site-wide default
#Comment by Brion VIBBER (talk | contribs)   18:37, 24 August 2011

Right -- having $wgCanonicalServer differ between http and https is what would be problematic. Just writing these things down explicitly so we don't forget when the scaptrap comes due. ;)

#Comment by Krinkle (talk | contribs)   23:13, 24 August 2011

See also bug 30398

#Comment by Dantman (talk | contribs)   22:04, 2 September 2011

getLocalURL can return absolute and protocol-relative urls (interwikis and action=render), we should probably take care to avoid prepending in those cases. Though for protocol-relatives returned we may want to add the same proto as $wgCanonicalServer.

Perhaps we should add a PROTO_CANONICAL to wfExpandUrl and use that here.

#Comment by Catrope (talk | contribs)   07:54, 5 September 2011

Hmm, right. I just copied the code from getInternalUrl(). I have, in fact, already added PROTO_CANONICAL in r95006, so I'll just use it here.

#Comment by Catrope (talk | contribs)   10:53, 5 September 2011

Dan already did it in r96180.

Status & tagging log