r89003 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r89002‎ | r89003 | r89004 >
Date:22:31, 27 May 2011
Author:brion
Status:ok (Comments)
Tags:
Comment:
* (bug 29174) Fix regression in upload-by-URL: files larger than PHP memory limit work again

r65152 switched upload-by-URL ($wgAllowCopyUploads) to use Http / MwHttpRequest class instead of CURL directly.
While this is mostly nice, it switched from saving large files directly to a temp file to buffering them in memory, causing large files to fail when they hit the PHP memory limit.

Fix uses MwHttpRequest's callback capability to override the read handler; now appending it to the temporary file as we go, and can confirm that largish files work again; was able to upload a 64mb .ogv that previously didn't work for me: http://prototype.wikimedia.org/tmh/images/b/b2/File-Arborophila_brunneopectus_pair_feeding_-_Kaeng_Krachan.ogv

Also expanded the documentation on MwHttpRequest::setCallback to clarify the function parameters and return value for the callback (which currently matches the low-level CURL handler's callback directly).
Note that the non-CURL implementation doesn't abort the read if the callback doesn't return the expected number of bytes, but this is an immediate fatal end of request on the Curl backend. May want further cleanup.
Modified paths:
  • /trunk/phase3/includes/HttpFunctions.php (modified) (history)
  • /trunk/phase3/includes/upload/UploadFromUrl.php (modified) (history)

Diff [purge]

Index: trunk/phase3/includes/upload/UploadFromUrl.php
@@ -105,41 +105,62 @@
106106 protected function makeTemporaryFile() {
107107 return tempnam( wfTempDir(), 'URL' );
108108 }
 109+
109110 /**
110 - * Save the result of a HTTP request to the temporary file
 111+ * Callback: save a chunk of the result of a HTTP request to the temporary file
111112 *
112 - * @param $req MWHttpRequest
113 - * @return Status
 113+ * @param $req mixed
 114+ * @param $buffer string
 115+ * @return int number of bytes handled
114116 */
115 - private function saveTempFile( $req ) {
116 - if ( $this->mTempPath === false ) {
117 - return Status::newFatal( 'tmp-create-error' );
 117+ public function saveTempFileChunk( $req, $buffer ) {
 118+ $nbytes = fwrite( $this->mTmpHandle, $buffer );
 119+
 120+ if ( $nbytes == strlen( $buffer ) ) {
 121+ $this->mFileSize += $nbytes;
 122+ } else {
 123+ // Well... that's not good!
 124+ fclose( $this->mTmpHandle );
 125+ $this->mTmpHandle = false;
118126 }
119 - if ( file_put_contents( $this->mTempPath, $req->getContent() ) === false ) {
120 - return Status::newFatal( 'tmp-write-error' );
121 - }
122127
123 - $this->mFileSize = filesize( $this->mTempPath );
 128+ return $nbytes;
 129+ }
124130
125 - return Status::newGood();
126 - }
127131 /**
128132 * Download the file, save it to the temporary file and update the file
129133 * size and set $mRemoveTempFile to true.
130134 */
131135 protected function reallyFetchFile() {
 136+ if ( $this->mTempPath === false ) {
 137+ return Status::newFatal( 'tmp-create-error' );
 138+ }
 139+
 140+ // Note the temporary file should already be created by makeTemporaryFile()
 141+ $this->mTmpHandle = fopen( $this->mTempPath, 'wb' );
 142+ if ( !$this->mTmpHandle ) {
 143+ return Status::newFatal( 'tmp-create-error' );
 144+ }
 145+
 146+ $this->mRemoveTempFile = true;
 147+ $this->mFileSize = 0;
 148+
132149 $req = MWHttpRequest::factory( $this->mUrl );
 150+ $req->setCallback( array( $this, 'saveTempFileChunk' ) );
133151 $status = $req->execute();
134152
135 - if ( !$status->isOk() ) {
136 - return $status;
 153+ if ( $this->mTmpHandle ) {
 154+ // File got written ok...
 155+ fclose( $this->mTmpHandle );
 156+ $this->mTmpHandle = null;
 157+ } else {
 158+ // We encountered a write error during the download...
 159+ return Status::newFatal( 'tmp-write-error' );
137160 }
138161
139 - $status = $this->saveTempFile( $req );
140 - if ( !$status->isGood() ) {
 162+ if ( !$status->isOk() ) {
141163 return $status;
142164 }
143 - $this->mRemoveTempFile = true;
144165
145166 return $status;
146167 }
Index: trunk/phase3/includes/HttpFunctions.php
@@ -316,11 +316,26 @@
317317 }
318318
319319 /**
320 - * Set the callback
 320+ * Set a read callback to accept data read from the HTTP request.
 321+ * By default, data is appended to an internal buffer which can be
 322+ * retrieved through $req->getContent().
321323 *
 324+ * To handle data as it comes in -- especially for large files that
 325+ * would not fit in memory -- you can instead set your own callback,
 326+ * in the form function($resource, $buffer) where the first parameter
 327+ * is the low-level resource being read (implementation specific),
 328+ * and the second parameter is the data buffer.
 329+ *
 330+ * You MUST return the number of bytes handled in the buffer; if fewer
 331+ * bytes are reported handled than were passed to you, the HTTP fetch
 332+ * will be aborted.
 333+ *
322334 * @param $callback Callback
323335 */
324336 public function setCallback( $callback ) {
 337+ if ( !is_callable( $callback ) ) {
 338+ throw new MWException( 'Invalid MwHttpRequest callback' );
 339+ }
325340 $this->callback = $callback;
326341 }
327342

Sign-offs

UserFlagDate
Tim Starlinginspected05:03, 17 June 2011

Follow-up revisions

RevisionCommit summaryAuthorDate
r896771.17: MFT r88492, r88870, r88871, r89003, r89108, r89114, r89115, r89129, r89...catrope19:14, 7 June 2011
r922891.17wmf1: Redoing r92287, merging r89003, r89005 into 1.17wmf1reedy19:28, 15 July 2011
r92330REL1_18 MFT r88750, r88870, r88871, r89003, r89005, r89114, r89115, r89129, r...reedy22:56, 15 July 2011

Past revisions this follows-up on

RevisionCommit summaryAuthorDate
r65152* New UploadFromUrlJob class to handle Upload-by-Copy...mah02:43, 17 April 2010

Comments

#Comment by Brion VIBBER (talk | contribs)   22:33, 27 May 2011

Needs merging to 1.17, 1.18 release branches.

Status & tagging log