I'll look into it later this week.
If I understand correctly, a loadbalancer factory should be considered session-specific. The solution would then be to shut down the current factory before executing jobs and create a new factory for the jobs.
Could do, I guess. Why is it necessary to prevent jobs from accessing the session? What bug does that fix?
When using asynchronous upload by URL, UploadFromUrlJob stores the result of the upload in the session so that API clients can retrieve the status. This is obviously not the request session but the session of the user who initiated the upload. The job therefore calls wfSetupSession() with the session id of the uploader.
This would all be unnecessary of course if jobs had a way to communicate their results back, but currently there is none.
This seems like a good reason to not use the PHP session extension. I've been talking for a while about getting rid of it; it doesn't really provide much value for us. If we had an object-oriented view of a session, then you could create an object which accessed the uploader's session, without changing the definition of $_SESSION.
I'm not really keen on closing all database connections and then reopening them just to get some session-related side effect.
A possible interim solution would be to implement an object-oriented session class which is compatible with PHP's session handling system. The object would read a given session file directly, unserialize it, provide access to the data via get/set methods, then reserialize it and store it back to the file. Obviously if memcached sessions were enabled, it would use memcached as a backend instead of files.
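A rough sketch of the shape such a class could take, written in Python rather than PHP to keep it short. Everything here is an assumption made for illustration: the names Session, FileBackend and DictBackend are hypothetical, DictBackend only stands in for memcached, and JSON stands in for the session extension's own serialization format, so this is not compatible with real PHP session files as written.

```python
import json
import os


class FileBackend:
    """Stores one blob per session in a directory, as sess_<id> files."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, session_id):
        # Refuse ids that could escape the session directory.
        if not session_id.isalnum():
            raise ValueError("invalid session id")
        return os.path.join(self.directory, "sess_" + session_id)

    def load(self, session_id):
        try:
            with open(self._path(session_id), "r", encoding="utf-8") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def store(self, session_id, blob):
        with open(self._path(session_id), "w", encoding="utf-8") as f:
            f.write(blob)


class DictBackend:
    """In-memory stand-in for a memcached backend (assumption, not a client)."""

    def __init__(self):
        self.blobs = {}

    def load(self, session_id):
        return self.blobs.get(session_id)

    def store(self, session_id, blob):
        self.blobs[session_id] = blob


class Session:
    """Object view of one session, independent of any global session state."""

    def __init__(self, backend, session_id):
        self.backend = backend
        self.session_id = session_id
        blob = backend.load(session_id)
        # JSON is a placeholder for the real session serialization format.
        self.data = json.loads(blob) if blob else {}

    def get(self, key, default=None):
        return self.data.get(key, default)

    def set(self, key, value):
        self.data[key] = value

    def save(self):
        self.backend.store(self.session_id, json.dumps(self.data))
```

With something like this, a job could open the uploader's session by id, call set() and save(), and never touch the session of the request that happens to be running the job.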
Then the next step (maybe done by someone else) would be to use the new class for all session access, replacing $_SESSION.
Session files really are just files, and they're required to be writable by the web user, so there's no problem with writing to them directly. The only tricky thing about it is the need to simulate the session-specific serialization function. For some reason, the session extension has its own format, and it doesn't provide a sensible interface to it.
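To give an idea of what simulating that format involves: with the default handler, a session file is a run of `name|serialized-value` entries, where the value part uses the same encoding as serialize(). The sketch below, again in Python for brevity, handles only top-level integers and strings; the function names are hypothetical, and arrays, objects, floats, booleans and keys containing `|` are deliberately left out.

```python
def encode_session(data):
    """Encode a dict of int/str values as name|value entries."""
    out = bytearray()
    for key, value in data.items():
        out += key.encode("utf-8") + b"|"
        if isinstance(value, bool):
            raise TypeError("booleans are not handled in this sketch")
        if isinstance(value, int):
            out += b"i:%d;" % value
        elif isinstance(value, str):
            raw = value.encode("utf-8")
            # String lengths are byte counts, not character counts.
            out += b's:%d:"' % len(raw) + raw + b'";'
        else:
            raise TypeError("unsupported type: %r" % type(value))
    return bytes(out)


def decode_session(blob):
    """Decode entries produced by encode_session back into a dict."""
    data = {}
    pos = 0
    while pos < len(blob):
        bar = blob.index(b"|", pos)
        key = blob[pos:bar].decode("utf-8")
        pos = bar + 1
        tag = blob[pos:pos + 2]
        if tag == b"i:":
            end = blob.index(b";", pos)
            data[key] = int(blob[pos + 2:end])
            pos = end + 1
        elif tag == b"s:":
            colon = blob.index(b":", pos + 2)
            length = int(blob[pos + 2:colon])
            start = colon + 2  # skip the colon and the opening quote
            data[key] = blob[start:start + length].decode("utf-8")
            pos = start + length + 2  # skip the closing quote and semicolon
        else:
            raise ValueError("unsupported value type at offset %d" % pos)
    return data
```

Because strings are length-prefixed, the decoder has to read the declared byte count rather than scan for a closing quote; a value containing `";` or `|` would otherwise be cut short.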
I'm not really sure background download handling needs access to session data; my own inclination would be to have a table listing ongoing download jobs, which can be indexed by a token returned to the uploader.
Keeping it separate from the HTTP login session would also allow a separate login session by the same user to ping for status checks, which certainly seems plausible for huge-file transfers.
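A minimal sketch of the job table idea, using Python and SQLite purely for illustration. The table name upload_job, its columns, and the function names are all assumptions, not an existing schema. The token is handed to the uploader when the job is queued, and status lookups are keyed on the token plus the user id, so any login session of the same user can poll.

```python
import secrets
import sqlite3


def open_job_store(path=":memory:"):
    """Open the database and create the (hypothetical) job table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS upload_job ("
        " token TEXT PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " url TEXT NOT NULL,"
        " status TEXT NOT NULL,"
        " detail TEXT)"
    )
    return conn


def create_job(conn, user_id, url):
    """Register a queued job and return the token given to the uploader."""
    token = secrets.token_hex(16)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO upload_job (token, user_id, url, status)"
            " VALUES (?, ?, ?, 'queued')",
            (token, user_id, url),
        )
    return token


def update_job(conn, token, status, detail=None):
    """Called by the background job to report progress or the final result."""
    with conn:
        conn.execute(
            "UPDATE upload_job SET status = ?, detail = ? WHERE token = ?",
            (status, detail, token),
        )


def get_status(conn, token, user_id):
    """Return (status, detail) for the user's job, or None if no match."""
    return conn.execute(
        "SELECT status, detail FROM upload_job"
        " WHERE token = ? AND user_id = ?",
        (token, user_id),
    ).fetchone()
```

The job only ever writes to its own row, so nothing about the login session has to be set up or torn down in the job runner.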
Totally agree. It seems like this was an abuse of session data to avoid having to figure out another way to track the background jobs.
So, what is your suggestion? Kill background uploading until somebody writes a way for jobs to communicate back? I'm fine with that.
Async uploading has been disabled for a couple of versions already, hasn't it?
It was killed in 1.17. It can be killed in 1.18 and trunk as well if nobody bothers to write infrastructure for jobs to communicate information back. For this specific purpose, though, Neil's database-backed upload stash could probably be used.
I don't have time to work on extensive projects though, so unless somebody else is willing to pick it up, I think it should be removed from MediaWiki altogether.