I mean, of course you should prefer to scan millions of rows instead of sorting zero, right?
Where is "millions" coming from?
flaggedpages WHERE fp_pending_since = NULL shouldn't require scanning anything close to millions (if the index is used), nor should scanning the flaggedpages_pending table.
I suppose one could pathologically imagine millions of pages needing review, but I was more concerned about what mysql does when "in category X" and (less so) "on my watchlist" JOINs are tossed in.
your thought is based on assumption that community is not a bunch of slackers and review changes immediately.
and yes, there're >100k pages needing review, and exactly "in category" and "on my watchlist" queries were super-expensive
Some wikis have 100k pages, I know. I've looked at that stats before :)
I didn't want the category/watchlist case to filesort to get the timestamp ordering right.
filesorting hundred or thousand rows is cheaper than scanning hundreds of thousands of rows - we do filesorts for watchlists already, rather than scanning full recentchanges
Gah,I meant "WHERE fp_pending_since IS NOT NULL" :)