r62087 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: r62087
Date: 16:10, 7 February 2010
Author: platonides
Status: deferred
Tags:
Comment:
Applied and tweaked the smart-import patch from Mij, merging it with 1.16.
This adds a --source-wiki-url parameter to importImages.php, naming a wiki
from which to fetch each file's original uploader and upload comment.
Useful when moving from local uploads to a shared repository.

Original code at http://www.howtopedia.org/public/mw-smart-import.tbz

"I release it with the least restrictive license applicable,
considering the code is derivative work of the respective
maintenance scripts from the MW distribution.
Feel free to commit this or modifications of it to your repos."
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-February/033230.html
Modified paths:
  • /trunk/phase3/RELEASE-NOTES (modified)
  • /trunk/phase3/maintenance/importImages.inc (modified)
  • /trunk/phase3/maintenance/importImages.php (modified)

Diff

Index: trunk/phase3/maintenance/importImages.inc
@@ -6,6 +6,7 @@
  * @file
  * @ingroup Maintenance
  * @author Rob Church <robchur@gmail.com>
+ * @author Mij <mij@bitchx.it>
  */

 /**
@@ -86,4 +87,26 @@
 }

 return false;
-}
\ No newline at end of file
+}
+
+# FIXME: Access the api in a saner way and performing just one query (preferably batching files too).
+function getFileCommentFromSourceWiki($wiki_host, $file) {
+ $url = $wiki_host . '/api.php?action=query&format=xml&titles=File:' . $file . '&prop=imageinfo&&iiprop=comment';
+ $body = file_get_contents($url);
+ if (preg_match('#<ii comment="([^"]*)" />#', $body, $matches) == 0) {
+ return false;
+ }
+
+ return $matches[1];
+}
+
+function getFileUserFromSourceWiki($wiki_host, $file) {
+ $url = $wiki_host . '/api.php?action=query&format=xml&titles=File:' . $file . '&prop=imageinfo&&iiprop=user';
+ $body = file_get_contents($url);
+ if (preg_match('#<ii user="([^"]*)" />#', $body, $matches) == 0) {
+ return false;
+ }
+
+ return $matches[1];
+}
+
Index: trunk/phase3/maintenance/importImages.php
@@ -2,16 +2,24 @@

 /**
  * Maintenance script to import one or more images from the local file system into
- * the wiki without using the web-based interface
+ * the wiki without using the web-based interface.
  *
+ * "Smart import" additions:
+ * - aim: preserve the essential metadata (user, description) when importing medias from an existing wiki
+ * - process:
+ * - interface with the source wiki, don't use bare files only (see --source-wiki-url).
+ * - fetch metadata from source wiki for each file to import.
+ * - commit the fetched metadata to the destination wiki while submitting.
+ *
  * @file
  * @ingroup Maintenance
  * @author Rob Church <robchur@gmail.com>
+ * @author Mij <mij@bitchx.it>
  */

-$optionsWithArgs = array( 'extensions', 'comment', 'comment-file', 'comment-ext', 'user', 'license', 'sleep', 'limit', 'from' );
+$optionsWithArgs = array( 'extensions', 'comment', 'comment-file', 'comment-ext', 'user', 'license', 'sleep', 'limit', 'from', 'source-wiki-url' );
 require_once( dirname(__FILE__) . '/commandLine.inc' );
-require_once( 'importImages.inc' );
+require_once( dirname(__FILE__) . '/importImages.inc.php' );
 $processed = $added = $ignored = $skipped = $overwritten = $failed = 0;

 echo( "Import Images\n\n" );
@@ -141,36 +149,51 @@
 $svar = 'added';
 }

- # Find comment text
- $commentText = false;
+ if (isset( $options['source-wiki-url'])) {
+ /* find comment text directly from source wiki, through MW's API */
+ $real_comment = getFileCommentFromSourceWiki($options['source-wiki-url'], $base);
+ if ($real_comment === false)
+ $commentText = $comment;
+ else
+ $commentText = $real_comment;

- if ( $commentExt ) {
- $f = findAuxFile( $file, $commentExt );
- if ( !$f ) {
- echo( " No comment file with extension {$commentExt} found for {$file}. " );
- $commentText = $comment;
- } else {
- $commentText = file_get_contents( $f );
- if ( !$f ) {
- echo( " Failed to load comment file {$f}. " );
- $commentText = $comment;
- } else if ( $comment ) {
- $commentText = trim( $commentText ) . "\n\n" . trim( $comment );
- }
- }
- }
+ /* find user directly from source wiki, through MW's API */
+ $real_user = getFileUserFromSourceWiki($options['source-wiki-url'], $base);
+ if ($real_user === false) {
+ $wgUser = $user;
+ } else {
+ $wgUser = User::newFromName($real_user);
+ if ($wgUser === false) {
+ # user does not exist in target wiki
+ echo ("failed: user '$real_user' does not exist in target wiki.");
+ continue;
+ }
+ }
+ } else {
+ # Find comment text
+ $commentText = false;

- if ( !$commentText ) {
- $commentText = $comment;
- }
+ if ( $commentExt ) {
+ $f = findAuxFile( $file, $commentExt );
+ if ( !$f ) {
+ echo( " No comment file with extension {$commentExt} found for {$file}, using default comment. " );
+ } else {
+ $commentText = file_get_contents( $f );
+ if ( !$f ) {
+ echo( " Failed to load comment file {$f}, using default comment. " );
+ }
+ }
+ }

- if ( !$commentText ) {
- $commentText = 'Importing image file';
- }
+ if ( !$commentText ) {
+ $commentText = $comment;
+ }
+ }

+
 # Import the file
 if ( isset( $options['dry'] ) ) {
- echo( " publishing {$file}... " );
+ echo( " publishing {$file} by '" . $wgUser->getName() . "', comment '$commentText'... " );
 } else {
 $archive = $image->publish( $file );
 if( WikiError::isError( $archive ) || !$archive->isGood() ) {
@@ -282,6 +305,8 @@
 --dry Dry run, don't import anything
 --protect=<protect> Specify the protect value (autoconfirmed,sysop)
 --unprotect Unprotects all uploaded images
+--source-wiki-url if specified, take User and Comment data for each imported file from this URL.
+ For example, --source-wiki-url="http://en.wikipedia.org/"

 TEXT;
 exit(1);
Index: trunk/phase3/RELEASE-NOTES
@@ -318,6 +318,8 @@
 the return value
 * Separate unit test suites under t/ and tests/ were merged and moved to
 maintenance/tests/.
+* importImages.php maintenance script can now use the original uploader and
+comment from another wiki.

 === Bug fixes in 1.16 ===

Follow-up revisions

Revision | Commit summary | Author | Date
r63693 | Fix regression from r62087 (!) breaking importImages.php. Backport to 1.16 f... | catrope | 17:57, 13 March 2010
r73587 | Fix regression caused by r62087 which failed to insert rows into the image ta... | overlordq | 02:59, 23 September 2010
r73754 | Revert r73587 and fix r62087 regression by providing the default value in $co... | platonides | 16:58, 25 September 2010

Comments

Comment by 😂, 16:23, 7 February 2010

Do not use file_get_contents() with remote URLs, as allow_url_fopen might be disabled. Use Http::get() or Http::post() as appropriate, which will (a) use cURL if available and (b) not fail terribly when we can't make external requests.
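
A minimal sketch of the suggestion above, not the code committed in any follow-up revision: the fetch goes through an injected callable instead of file_get_contents(). The function name fetchCommentFromSourceWiki and the $fetch parameter are hypothetical. The default value assumes MediaWiki's Http class of that era is loaded and that Http::get() returns the response body, or false on failure. The file name is also URL-encoded here, which the diff above does not do.

```php
<?php
/**
 * Fetch the upload comment for $file from the source wiki's API.
 *
 * $fetch is a hypothetical injection point: any callable that takes a URL
 * and returns the response body as a string, or false on failure.
 * The default (Http::get) is only resolved when the function is called,
 * so it assumes MediaWiki is loaded in that case.
 */
function fetchCommentFromSourceWiki( $wiki_host, $file, $fetch = array( 'Http', 'get' ) ) {
	$url = rtrim( $wiki_host, '/' )
		. '/api.php?action=query&format=xml&titles=File:' . rawurlencode( $file )
		. '&prop=imageinfo&iiprop=comment';

	$body = call_user_func( $fetch, $url );
	if ( $body === false ) {
		# The request failed: report "no comment" instead of parsing garbage
		return false;
	}

	# Same regex as in the diff; only the transport has changed
	if ( preg_match( '#<ii comment="([^"]*)" />#', $body, $matches ) == 0 ) {
		return false;
	}
	return $matches[1];
}
```

Passing the transport as a callable also lets the function be exercised without network access, by handing it a closure that returns a canned response.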

Comment by Platonides, 18:41, 7 February 2010

Done in r62092, although getFileCommentFromSourceWiki and getFileUserFromSourceWiki are still just a big hack.
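
The FIXME in importImages.inc asks for a single query and a saner way of reading the API. One way that could look, as a sketch under assumptions: request the user and comment together with iiprop=user|comment and parse the XML with SimpleXML instead of a regex. buildImageInfoUrl and parseImageInfo are hypothetical names, and fetching the URL is left to the caller.

```php
<?php
/**
 * Build one API URL that asks for both the uploader and the upload comment.
 * Hypothetical helper; the title and the iiprop value are URL-encoded.
 */
function buildImageInfoUrl( $wiki_host, $file ) {
	return rtrim( $wiki_host, '/' )
		. '/api.php?action=query&format=xml'
		. '&titles=' . rawurlencode( 'File:' . $file )
		. '&prop=imageinfo&iiprop=' . rawurlencode( 'user|comment' );
}

/**
 * Extract user and comment from an API response body (format=xml).
 * Returns array( 'user' => ..., 'comment' => ... ), or false when the body
 * is not well-formed XML or contains no <ii> element.
 */
function parseImageInfo( $body ) {
	# Collect XML errors quietly instead of emitting PHP warnings
	$previous = libxml_use_internal_errors( true );
	$xml = simplexml_load_string( $body );
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );
	if ( $xml === false ) {
		return false;
	}

	$nodes = $xml->xpath( '//imageinfo/ii' );
	if ( !$nodes ) {
		return false;
	}

	# Attribute values come back with XML entities already decoded
	return array(
		'user' => (string)$nodes[0]['user'],
		'comment' => (string)$nodes[0]['comment'],
	);
}
```

Compared with the two regex-based functions, this needs one request per file and does not depend on the exact attribute order or spacing of the ii element. Batching several titles into one request, as the FIXME also suggests, is not covered here.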

Comment by OverlordQ, 02:52, 23 September 2010

This also breaks the script when you don't specify a comment.
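
The diff drops the last-resort 'Importing image file' default that the removed code provided, so $commentText can end up empty when no --comment is given. A sketch of a fallback chain that always yields some text follows. resolveCommentText is a hypothetical helper and is not a description of how the follow-up revisions fixed the problem; unlike the pre-r62087 code, it does not concatenate the comment file with --comment.

```php
<?php
/**
 * Pick the first usable comment from a list of candidates, in priority order
 * (for example: source wiki comment, comment file contents, --comment value).
 * Candidates that are false, null, or blank strings are skipped.
 * Falls back to $default so the result is never empty.
 */
function resolveCommentText( array $candidates, $default = 'Importing image file' ) {
	foreach ( $candidates as $candidate ) {
		if ( is_string( $candidate ) && trim( $candidate ) !== '' ) {
			return $candidate;
		}
	}
	return $default;
}
```

In the import loop this might be called as resolveCommentText( array( $real_comment, $commentText, $comment ) ), with the variable names taken from the diff above.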

Comment by Platonides, 17:01, 25 September 2010

Fixed in r73754.
