r57286 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r57286
Date:10:13, 2 October 2009
Author:daniel
Status:ok
Tags:todo 
Comment:
importImages --skip-dupes checks for dupes using sha1
Modified paths:
  • /trunk/phase3/maintenance/importImages.php (modified)

Diff

Index: trunk/phase3/maintenance/importImages.php
@@ -124,6 +124,19 @@
 		continue;
 	}
 } else {
+	if ( isset( $options['skip-dupes'] ) ) {
+		$repo = $image->getRepo();
+		$sha1 = File::sha1Base36( $file ); #XXX: we end up calculating this again when actually uploading. that sucks.
+
+		$dupes = $repo->findBySha1( $sha1 );
+
+		if ( $dupes ) {
+			echo( "{$base} already exists as " . $dupes[0]->getName() . ", skipping\n" );
+			$skipped++;
+			continue;
+		}
+	}
+
 	echo( "Importing {$base}..." );
 	$svar = 'added';
 }
@@ -253,6 +266,7 @@
 --limit=<num>	Limit the number of images to process. Ignored or skipped images are not counted.
 --from=<name>	Ignore all files until the one with the given name. Useful for resuming
 	aborted imports. <name> should be the file's canonical database form.
+--skip-dupes	Skip images that were already uploaded under a different name (check SHA1)
 --sleep=<sec>	Sleep between files. Useful mostly for debugging.
 --user=<username>	Set username of uploader, default 'Maintenance script'
 --check-userblock	Check if the user got blocked during import.
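
The check added in this revision boils down to: hash the local file, ask the repo whether that hash is already stored, and skip the file on a hit. Below is a minimal standalone sketch of that flow, not the script's actual code. `partitionByDuplicate()` and the `$known` map are hypothetical stand-ins for the import loop and for `$repo->findBySha1()`, and plain hex `sha1_file()` is used here in place of `File::sha1Base36()`.

```php
<?php
/**
 * Split candidate files into ones to import and ones to skip.
 * Hypothetical helper for illustration; not part of importImages.php.
 *
 * @param string[] $paths Local files to consider.
 * @param array    $known Map of hex SHA-1 => name already stored (assumed index).
 * @return array [ string[] $toImport, array $skipped (path => existing name) ]
 */
function partitionByDuplicate( array $paths, array $known ) {
	$toImport = array();
	$skipped = array();

	foreach ( $paths as $path ) {
		$sha1 = sha1_file( $path );

		if ( $sha1 === false ) {
			// Unreadable file: no hash to compare, so leave it to the importer.
			$toImport[] = $path;
			continue;
		}

		if ( isset( $known[$sha1] ) ) {
			// Same content is already stored under another name.
			$skipped[$path] = $known[$sha1];
			continue;
		}

		// Remember the hash so a second copy in the same batch is skipped too.
		$known[$sha1] = basename( $path );
		$toImport[] = $path;
	}

	return array( $toImport, $skipped );
}
```

Returning the hash alongside each path would be one way to address the `#XXX` note in the diff, since the upload step could then reuse it instead of hashing the file a second time. Whether the upload code can accept a precomputed hash is not something this revision shows.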

Comments

Comment by Brion VIBBER   18:47, 6 October 2009

Might be good to do the dupe check globally to check for shared images when we're uploading to a non-main repo.
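
One way to read this suggestion is to run the same SHA-1 lookup against every configured repo rather than only the repo of the target image. The sketch below illustrates that idea with a hypothetical in-memory `SketchRepo` class and a hypothetical `findDupesInAllRepos()` helper. Neither is MediaWiki code, and whether MediaWiki already provides an equivalent group-wide lookup is not confirmed by this revision.

```php
<?php
// Hypothetical minimal repo: only the lookup needed for this sketch.
class SketchRepo {
	private $name;
	private $bySha1;

	/**
	 * @param string $name   Repo label, e.g. 'local' or 'shared'.
	 * @param array  $bySha1 Map of SHA-1 => list of file names with that content.
	 */
	public function __construct( $name, array $bySha1 ) {
		$this->name = $name;
		$this->bySha1 = $bySha1;
	}

	public function getName() {
		return $this->name;
	}

	/** @return string[] Names of files whose content hash matches. */
	public function findBySha1( $sha1 ) {
		return isset( $this->bySha1[$sha1] ) ? $this->bySha1[$sha1] : array();
	}
}

/**
 * Collect duplicates from every repo, not just the upload target.
 *
 * @param SketchRepo[] $repos Repos to search, in priority order.
 * @param string       $sha1  Content hash of the file about to be imported.
 * @return string[] Matches formatted as "repo:file name".
 */
function findDupesInAllRepos( array $repos, $sha1 ) {
	$dupes = array();
	foreach ( $repos as $repo ) {
		foreach ( $repo->findBySha1( $sha1 ) as $fileName ) {
			$dupes[] = $repo->getName() . ':' . $fileName;
		}
	}
	return $dupes;
}
```

With a result like this, the import loop could skip a file as soon as the returned list is non-empty, and the repo prefix would tell the operator whether the existing copy is local or shared.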
