r78585 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r78584‎ | r78585 | r78586 >
Date:23:07, 18 December 2010
Author:happy-melon
Status:reverted (Comments)
Tags:
Comment:
Merge in Token class from my branch; would like some second opinions on the concept. This seeks to replace the much-abused $wgUser->editToken() token generator, 90% of the uses of which are not for editing but rather general actions.

Implemented for rollback mainly as a proof-of-concept; obvious further targets are patrol links and HTMLForm.
Modified paths:
  • /trunk/phase3/includes/Article.php (modified) (history)
  • /trunk/phase3/includes/AutoLoader.php (modified) (history)
  • /trunk/phase3/includes/Linker.php (modified) (history)
  • /trunk/phase3/includes/Token.php (added) (history)
  • /trunk/phase3/includes/api/ApiQueryRevisions.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/Article.php
@@ -3332,7 +3332,8 @@
33333333 $rollbackErrors = $this->mTitle->getUserPermissionsErrors( 'rollback', $wgUser );
33343334 $errors = array_merge( $editErrors, wfArrayDiff2( $rollbackErrors, $editErrors ) );
33353335
3336 - if ( !$wgUser->matchEditToken( $token, array( $this->mTitle->getPrefixedText(), $fromP ) ) ) {
 3336+ $t = new Token( Token::PERSISTENT, array( $this->mTitle->getPrefixedText(), $fromP ) );
 3337+ if ( !$t->match( $token ) ) {
33373338 $errors[] = array( 'sessionfailure' );
33383339 }
33393340
Index: trunk/phase3/includes/Token.php
@@ -0,0 +1,217 @@
 2+<?php
 3+/**
 4+ * Deal with importing all those nasssty globals and things
 5+ *
 6+ * Copyright © 2003 Brion Vibber <brion@pobox.com>
 7+ * http://www.mediawiki.org/
 8+ *
 9+ * This program is free software; you can redistribute it and/or modify
 10+ * it under the terms of the GNU General Public License as published by
 11+ * the Free Software Foundation; either version 2 of the License, or
 12+ * (at your option) any later version.
 13+ *
 14+ * This program is distributed in the hope that it will be useful,
 15+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
 16+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 17+ * GNU General Public License for more details.
 18+ *
 19+ * You should have received a copy of the GNU General Public License along
 20+ * with this program; if not, write to the Free Software Foundation, Inc.,
 21+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 22+ * http://www.gnu.org/copyleft/gpl.html
 23+ *
 24+ * @file
 25+ */
 26+
 27+/**
 28+ * CSRF attacks (where a malicious website uses frames, <img> tags, or
 29+ * similar, to prompt a wiki user to open a wiki page or submit a form,
 30+ * without being aware of doing so) are most easily countered by using
 31+ * tokens. For normal browsing, loading the form for a protected action
 32+ * sets two copies of a random string: one in the $_SESSION, and one as
 33+ * a hidden field in the form. When the form is submitted, it checks
 34+ * that a) the set of cookies submitted with the form *has* a copy of
 35+ * the session cookie, and b) that it matches. Since malicious websites
 36+ * don't have control over the session cookies, they can't craft a form
 37+ * that can be instantly submitted which will have the appropriate tokens.
 38+ *
 39+ * Note that these tokens are distinct from those in User::setToken(), which
 40+ * are used for persistent session authentication and are retained for as
 41+ * long as the user is logged in to the wiki. These tokens are to protect
 42+ * one individual action, and should ideally be cleared once the action is over.
 43+ */
 44+class Token {
 45+
 46+ /*
 47+ * Some punctuation to prevent editing from broken
 48+ * text-mangling proxies.
 49+ */
 50+ const TOKEN_SUFFIX = '+\\';
 51+
 52+ /**
 53+ * Different tokens for different types of action.
 54+ *
 55+ * We don't store tokens for some actions for anons
 56+ * so they can still do things when they have cookies disabled.
 57+ * So either use this for actions which anons can't access, or
 58+ * where you don't mind an attacker being able to trigger the action
 59+ * anonymously from the user's IP. However, the token is still
 60+ * useful because it fails with some broken proxies.
 61+ */
 62+ const ANONYMOUS = 'Edit';
 63+
 64+ /**
 65+ * For actions requiring a medium level of protection, or where the
 66+ * user will be making repeated actions: this token should not be
 67+ * cleared once the action is completed. For instance, a user might
 68+ * revert mass vandalism from a user by loading their contribs and
 69+ * ctrl+clicking each rollback link. If we cleared the Token from
 70+ * session after each rollback, they'd have to reload the contribs
 71+ * page each time, which would be annoying.
 72+ */
 73+ const PERSISTENT = 'Action';
 74+
 75+ /**
 76+ * For actions requiring a high level of protection, and where the user
 77+ * will not be performing multiple sequential actions without reloading
 78+ * the form or link. Eg login, block/protect/delete, userrights, etc.
 79+ * Callers should clear these tokens upon completion of the action, and
 80+ * other callers should expect that they will be cleared.
 81+ */
 82+ const UNIQUE = 'Unique';
 83+
 84+ /**
 85+ * String the action which is being protected by the token
 86+ * ('edit', 'login', 'rollback', etc)
 87+ */
 88+ protected $type = self::ANONYMOUS;
 89+
 90+ /**
 91+ * An instance-specific salt. So if you want to generate a hundred rollback
 92+ * tokens for the watchlist, pass a $salt which is unique
 93+ * to each revision. Only one token is stored in the session, but it is munged
 94+ * with a different salt for each revision, so the required value in the HTML
 95+ * is different for each case.
 96+ */
 97+ protected $salt = '';
 98+
 99+ protected $request;
 100+
 101+ /**
 102+ * Constructor
 103+ * @param $salt String an instance-specific salt. @see Token::$salt
 104+ * @param $type Token class constant identifier
 105+ * @param $request WebRequest most of the time you'll want to get/store
 106+ * the tokens in $wgRequest, which is the default.
 107+ */
 108+ public function __construct( $salt, $type = self::ANONYMOUS, WebRequest $request = null ){
 109+ global $wgRequest;
 110+ $this->type = $type;
 111+
 112+ if( is_array( $this->salt ) ) {
 113+ $this->salt = implode( '|', $this->salt );
 114+ } else {
 115+ $this->salt = strval( $salt );
 116+ }
 117+
 118+ $this->request = $request instanceof WebRequest
 119+ ? $request
 120+ : $wgRequest;
 121+ }
 122+
 123+ /**
 124+ * Ensure that a token is set in cookies, by setting a new one
 125+ * if necessary.
 126+ * @param $purge Bool whether to overwrite an existing token in
 127+ * session if there is one. This is more secure, but will
 128+ * only allow one Token of a particular $action to be used on
 129+ * the page (which may itself be a good thing).
 130+ * @return String The version of the token which should be included
 131+ * in the HTML form/link.
 132+ */
 133+ public function set( $purge = false ) {
 134+ global $wgUser;
 135+ if ( $this->type == self::ANONYMOUS && $wgUser->isAnon() ) {
 136+ return self::TOKEN_SUFFIX;
 137+ }
 138+
 139+ if( $purge || $this->get() === null ){
 140+ $token = self::generate();
 141+ if( session_id() == '' ) {
 142+ wfSetupSession();
 143+ }
 144+ $this->store( $token );
 145+ } else {
 146+ $token = $this->get();
 147+ }
 148+
 149+ return md5( $token . $this->salt ) . self::TOKEN_SUFFIX;
 150+ }
 151+
 152+ /**
 153+ * Check whether the copy of the token submitted with a form
 154+ * matches the version stored in session
 155+ * @param $val String version submitted with the form.
 156+ * @return Mixed null if no session token was set, Bool false if there
 157+ * was a token but it didn't match, Bool true if it matched correctly
 158+ */
 159+ public function match( $val ){
 160+ global $wgUser;
 161+ if( $this->type == self::ANONYMOUS && $wgUser->isAnon() ){
 162+ return $val === self::TOKEN_SUFFIX;
 163+ }
 164+
 165+ if( $this->get() === null ){
 166+ return null;
 167+ }
 168+
 169+ return md5( $this->get() . $this->salt ) . self::TOKEN_SUFFIX === $val;
 170+ }
 171+
 172+ /**
 173+ * Delete the token after use, so it can't be used again. This will
 174+ * invalidate all tokens for this Token's action type.
 175+ */
 176+ public function clear(){
 177+ $this->store( null );
 178+ }
 179+
 180+ /**
 181+ * Prepare a new Token for a given action, set it in session, and
 182+ * return the value we need to pass in the HTML
 183+ * @param $salt String
 184+ * @param $type Token class constant identifier
 185+ * @return String token string to store in HTML
 186+ */
 187+ public static function prepare( $salt, $type = self::ANONYMOUS ){
 188+ $t = new Token( $salt, $type );
 189+ return $t->set( false );
 190+ }
 191+
 192+ /**
 193+ * Generate a random token
 194+ * @param $salt String Optional salt value
 195+ * @return String 32-char random token
 196+ */
 197+ protected static function generate( $salt = '' ) {
 198+ $rand = dechex( mt_rand() ) . dechex( mt_rand() );
 199+ return md5( $rand . $salt );
 200+ }
 201+
 202+ /**
 203+ * Set the given token for the given action in the session
 204+ * @param $token String
 205+ * @param $action String
 206+ */
 207+ protected function store( $token ){
 208+ $this->request->setSessionData( "ws{$this->type}Token", $token );
 209+ }
 210+
 211+ /**
 212+ * Get the token set for a given action
 213+ * @return String or null if no token was stored in the session
 214+ */
 215+ protected function get(){
 216+ return $this->request->getSessionData( "ws{$this->type}Token" );
 217+ }
 218+}
Property changes on: trunk/phase3/includes/Token.php
___________________________________________________________________
Added: svn:eol-style
1219 + native
Index: trunk/phase3/includes/Linker.php
@@ -1499,17 +1499,23 @@
15001500 $title = $rev->getTitle();
15011501 $query = array(
15021502 'action' => 'rollback',
1503 - 'from' => $rev->getUserText()
 1503+ 'from' => $rev->getUserText(),
 1504+ 'token' => Token::prepare(
 1505+ Token::PERSISTENT,
 1506+ array( $title->getPrefixedText(), $rev->getUserText() )
 1507+ ),
15041508 );
15051509 if ( $wgRequest->getBool( 'bot' ) ) {
15061510 $query['bot'] = '1';
15071511 $query['hidediff'] = '1'; // bug 15999
15081512 }
1509 - $query['token'] = $wgUser->editToken( array( $title->getPrefixedText(),
1510 - $rev->getUserText() ) );
1511 - return $this->link( $title, wfMsgHtml( 'rollbacklink' ),
 1513+ return $this->link(
 1514+ $title,
 1515+ wfMsgHtml( 'rollbacklink' ),
15121516 array( 'title' => wfMsg( 'tooltip-rollback' ) ),
1513 - $query, array( 'known', 'noclasses' ) );
 1517+ $query,
 1518+ array( 'known', 'noclasses' )
 1519+ );
15141520 }
15151521
15161522 /**
Index: trunk/phase3/includes/api/ApiQueryRevisions.php
@@ -78,8 +78,10 @@
7979 if ( !$wgUser->isAllowed( 'rollback' ) ) {
8080 return false;
8181 }
82 - return $wgUser->editToken( array( $title->getPrefixedText(),
83 - $rev->getUserText() ) );
 82+ return Token::prepare(
 83+ Token::PERSISTENT,
 84+ array( $title->getPrefixedText(), $rev->getUserText() )
 85+ );
8486 }
8587
8688 public function execute() {
Index: trunk/phase3/includes/AutoLoader.php
@@ -238,6 +238,7 @@
239239 'TitleArray' => 'includes/TitleArray.php',
240240 'TitleArrayFromResult' => 'includes/TitleArray.php',
241241 'TitleListDependency' => 'includes/CacheDependency.php',
 242+ 'Token' => 'includes/Token.php',
242243 'UnlistedSpecialPage' => 'includes/SpecialPage.php',
243244 'User' => 'includes/User.php',
244245 'UserArray' => 'includes/UserArray.php',

Follow-up revisions

RevisionCommit summaryAuthorDate
r78586Lol, I don't think that's part of the standard GPL header... :-D (follow-up r...happy-melon23:09, 18 December 2010
r78599Follow-up r78585: Make Token::PERSISTENT the default, so no need to specify i...happy-melon15:23, 19 December 2010
r78631Revert rollback implementation of r78585, r78599. The way the API is set up,...happy-melon19:03, 20 December 2010
r82306Revert r78585. While I definitely think this is an area where there's scope ...happy-melon22:29, 16 February 2011

Comments

#Comment by Platonides (talk | contribs)   16:34, 19 December 2010
  • For instance, a user might
* revert mass vandalism from a user by loading their contribs and
* ctrl+clicking each rollback link. If we cleared the Token from
* session after each rollback, they'd have to reload the contribs
* page each time, which would be annoying. 

This is a quite bad example, since the rollback links are prefixed with the page name and the rolledback user precisely to make them unique so that they only work on that exact page. There's not point on keeping them. They are for using once and discard. I also miss a mention on why those kind of tokens must be discarded (since they are GET request, they get logged in proxys and are it's thus unsafe to reuse them).


+	/**
+	 * String the action which is being protected by the token 
+	 * ('edit', 'login', 'rollback', etc)
+	 */
+	protected $type = self::ANONYMOUS;

There seems to be a mixture between actions and security levels. From reading the code, $type identifies the security level, and all 'login', 'rollback', etc. would be of type PERSISTENT. The comment should be fixed.

+ global $wgUser; Would be nice if the user was handled similar to Token::request. Make the user the third parameter and move $request to the fourth.


I'd expect that a UNIQUE token would be automatically cleared once it matches.

A user should be able to load two different Unique actions on different tabs and perform both.

#Comment by Happy-melon (talk | contribs)   17:01, 19 December 2010

This is a quite bad example, since the rollback links are prefixed with the page name and the rolledback user precisely to make them unique so that they only work on that exact page. There's not point on keeping them.

But for all the rollback links on the page, there is only one key stored in the session, in both the old and new systems. It's salted in different ways for each instance so that they're all unique, but for each link it's retrieving the same token from the session, and comparing it in different ways. If you cleared the session token after the first rollback, all the other rollback links would stop working because there was no session value to retrieve.

They are for using once and discard. I also miss a mention on why those kind of tokens must be discarded (since they are GET request, they get logged in proxys and are it's thus unsafe to reuse them).

The part of the token which is sent in the GET request is the non-session part; it's the bit which would be set in a hidden field in a POSTed form. When the user clicks on the rollback link, the checker compares the token in the GET to token-in-session-hashed-with-the-unique-salt.

There seems to be a mixture between actions and security levels. From reading the code, $type identifies the security level, and all 'login', 'rollback', etc. would be of type PERSISTENT.

I made a change partway through to three different types of tokens rather than a different type for each and every action.

+ global $wgUser; Would be nice if the user was handled similar to Token::request. Make the user the third parameter and move $request to the fourth.

You can't really get tokens for anyone other than the current user, because you can only access the current user's session. Is there anywhere in code where this would be needed?

I'd expect that a UNIQUE token would be automatically cleared once it matches.

Good idea.

A user should be able to load two different Unique actions on different tabs and perform both.

That could be a challenge... Might need different session elements for each UNIQUE token; maybe based on the salt...?

#Comment by Platonides (talk | contribs)   17:57, 19 December 2010

But for all the rollback links on the page, there is only one key stored in the session, in both the old and new systems. It's salted in different ways for each instance so that they're all unique, but for each link it's retrieving the same token from the session, and comparing it in different ways. If you cleared the session token after the first rollback, all the other rollback links would stop working because there was no session value to retrieve.

You are right. The relationship between token instances as seen from the user and Token objects is not injective. Maybe we could rename the token class to avoid confusing them?

Also, prepare() is a three items singleton since all Token objects with the same type are equivalent. So you create several token objects but they are magically the same (that bit must be kept into account regarding clearing). And if you want to use the same master token with a different salt, you must create a new object. This doesn't seem a good design. The salt could be a member parameter, or there could be a master token class, which generates token instances which then have different salts...


You can't really get tokens for anyone other than the current user, because you can only access the current user's session. Is there anywhere in code where this would be needed?

The objective was to reduce usage of globals in new code.

That could be a challenge... Might need different session elements for each UNIQUE token; maybe based on the salt...?

There could be a class in the session which generates new tokens based on the required security and tracks the issued ones.

Or we could scrap all of this. Can you make a list of the 'problems it tries to solve'? There are some shortcoming with the current approach (eg. actions needing a token even for anons), but so far it has worked quite well. This may be making things too much complex.

#Comment by Happy-melon (talk | contribs)   18:23, 19 December 2010

You are right. The relationship between token instances as seen from the user and Token objects is not injective. Maybe we could rename the token class to avoid confusing them?

I don't entirely understand what you're saying here. Are you saying that the implementation is different between the old and new, or between two facets of the new? Why do you think a name change would help?

And if you want to use the same master token with a different salt, you must create a new object. This doesn't seem a good design. The salt could be a member parameter, or there could be a master token class, which generates token instances which then have different salts...

But where do we ever need to do this? We don't get multiple tokens at the same level in the callstack; either we only need one token for the page, or we need to get one token in the course of preparing one object, which we may do multiple times. There's no need to be able to reuse Token objects. Or am I missing a usecase?

The objective was to reduce usage of globals in new code.

But it doesn't reduce the usage of globals at all, because you either have to get $wgUser in the constructor, or get $wgUser in every caller and pass it to the constructor. I don't see how that has practical advantages.

Or we could scrap all of this. Can you make a list of the 'problems it tries to solve'? There are some shortcoming with the current approach (eg. actions needing a token even for anons), but so far it has worked quite well. This may be making things too much complex.

Primarily it's a refactoring to try and improve readability; but it should also make it clearer when and where tokens are required, and how sensitive they are. There is a security consideration in that many tokens are currently unsalted: if you sniff a copy of an admin's Special:MovePage form, you can use the same token to CSRF a Special:BlockIP form, as they're both unsalted. And virtually no extensions use tokens at all.

I'm genuinely unsure about whether the best approach is the $type system in there now, or the $action method (different session tokens for 'rollback', 'login', etc) that I had before; thoughts are welcome. Tim's closure of the CSRF in Special:UserLogin indicated that it's valuable for the login token not to be the same as the edit token, but I don't know whether it's better to extend that to having different tokens for each action, or to keep the session a bit cleaner by not having so many different tokens.

Putting an object in the session to act as a 'token jar' might work...

#Comment by Platonides (talk | contribs)   18:51, 19 December 2010

For me the token is the aabbccddee... string sent to the user. When making a list of rollback links, you are creating several tokens.

With $tokenJar->generate( $salt ) I think it's clearer to see that you are creating a token from a given source.

than noting that (new Token( Token::PERSISTENT, 'Main Page|Happy-melon') )->clean(); will also delete new Token( Token::PERSISTENT, 'WP:RFA|Jimmy Wales')

I'm genuinely unsure about whether the best approach is the $type system in there now, or the $action method (different session tokens for 'rollback', 'login', etc) that I had before; thoughts are welcome. Tim's closure of the CSRF in Special:UserLogin indicated that it's valuable for the login token not to be the same as the edit token, but I don't know whether it's better to extend that to having different tokens for each action, or to keep the session a bit cleaner by not having so many different tokens.

Note that for Special:UserLogin, $wgUser->editToken() wouldn't have worked since it would always return +\

I don't see where you are getting that indication from r64677. Changing tokens for each action is indeed a bit more secure, but try not to be obnoxious. I'm sure it will break bots and scripts, so too much different tokens (eg. onj moving: one per source page) wouldn't be appropiate.

#Comment by Happy-melon (talk | contribs)   17:52, 20 December 2010

For me the token is the aabbccddee... string sent to the user. When making a list of rollback links, you are creating several tokens.

I think that way of looking at it is a mistake, and masks the whole point of the tokens which is that they are in two parts, one in the session and one in the HTML request.

Currently if you get given a rollback link with &token=f0ObAr, that's because you have a session token "baZqu0k", and hashing that session token together with the title of the page and the user to rollback gives you the HTML token. Currently there is no method to clear the session part of the token, because that would invalidate pretty much every token the user has, for rollback, block, delete, etc; login is the only exception. But without that invalidation, the token &token=f0ObAr can still be successfully used to CSRF a rollback to the same page from the same user for as long as the user stays logged in, which is obviously undesirable. But it's a mistake to only think about the HTML part of the token, because you can't "invalidate" that; the way you 'invalidate' a token is to change or remove the session component of it, which invalidates all the HTML tokens using it (which is currently everything).

A solution would be to switch to a one-to-one mapping between session tokens and HTML tokens - so every rollback link generated would store a new session token. That doesn't change the behaviour from the perspective of bots and scripts, since they already need to ask for the HTML token for every individual rollback they perform. Since the session data is stored on the server, not the client, end-users wouldn't see any change at all. But on the server side we'd need to have a TokenJar to garbage-collect unused tokens after a timeout, because they wouldn't be cleared unless they were 'used'.

Essentially there are two elements to this: making sure that each HTML token given out is unique, which is done by salting correctly; and making HTML tokens single-use, which is done by having separate corresponding session tokens and clearing them after use. Which do you think are important?

#Comment by Platonides (talk | contribs)   18:11, 20 December 2010

I think that way of looking at it is a mistake, and masks the whole point of the tokens which is that they are in two parts, one in the session and one in the HTML request.

Only calling 'f0ObAr' a token just means that I want to call 'baZqu0k' something different: Master token, TokenGenerator. token jar...

I agree with your following paragraph. Note that the rollback issue can be fixed with a minor funcionality change by adding the revision id to the salt, but it's not so easy for other forms. Basically everything which the user can change in the final form (eg. the target user to block), means that we can't use it for token salting.

By breaking scripts I wasn't refering rollback links. I doubt any change on them them will break any script.

We could easily make tokens expire by adding them in the objectache prefixed by the userid and a token prefix. Now, what would be a sensible time for that?

#Comment by Platonides (talk | contribs)   17:59, 20 December 2010

Breaks the ApiWatchTest::testWatchRollback phpunit test: Received error badtoken

#Comment by Happy-melon (talk | contribs)   18:47, 20 December 2010

Hmm, $wgUser->editToken() is called in ApiMain::setupModule(), the same way for all modules. So it wouldn't be (easily) possible to separate rollback from the other token-controlled actions in the API and have some on the new system and some on the old.

#Comment by Happy-melon (talk | contribs)   19:04, 20 December 2010

Reverted rollback implementation in r78631.

Status & tagging log