r68873 MediaWiki - Code Review archive

Repository: MediaWiki
Revision: r68873 (previous: r68872, next: r68874)
Date: 12:11, 2 July 2010
Author: hartman
Status: resolved (Comments)
Tags: (none)
Comment:
(bug 24073) Recognize MS Office 2003 style files that have been saved by MS 2007.
These files have OPC trailers with 2007 specific information.
Modified paths:
  • /trunk/phase3/RELEASE-NOTES (modified)
  • /trunk/phase3/includes/MimeMagic.php (modified)

Diff

Index: trunk/phase3/RELEASE-NOTES
@@ -222,6 +222,7 @@
 * (bug 22784) Normalise underscores and spaces in autocomments.
 * (bug 19910) Headings of the form ===+\s+ are now displayed as valid headings
 * (bug 24022) Only check file extensions on the uploadpage when needed.
+* (bug 24076) Recognize Office 2003 files with OpenXML trailers

 === API changes in 1.17 ===
 * (bug 22738) Allow filtering by action type on query=logevent.
Index: trunk/phase3/includes/MimeMagic.php
@@ -545,10 +545,10 @@
     }
 }

-// Check for ZIP (before getimagesize)
+// Check for ZIP variants (before getimagesize)
 if ( strpos( $tail, "PK\x05\x06" ) !== false ) {
-    wfDebug( __METHOD__.": ZIP header present at end of $file\n" );
-    return $this->detectZipType( $head, $ext );
+    wfDebug( __METHOD__.": ZIP header present in $file\n" );
+    return $this->detectZipType( $head, $tail, $ext );
 }

 wfSuppressWarnings();
@@ -573,16 +573,17 @@

 /**
  * Detect application-specific file type of a given ZIP file from its
- * header data. Currently works for OpenDocument types...
+ * header data. Currently works for OpenDocument and OpenXML types...
  * If can't tell, returns 'application/zip'.
  *
  * @param $header String: some reasonably-sized chunk of file header
+ * @param $tail String: the tail of the file
  * @param $ext Mixed: the file extension, or true to extract it from the filename.
  * Set it to false to ignore the extension.
  *
  * @return string
  */
-function detectZipType( $header, $ext = false ) {
+function detectZipType( $header, $tail = null, $ext = false ) {
     $mime = 'application/zip';
     $opendocTypes = array(
         'chart-template',
@@ -605,14 +606,13 @@
 // http://lists.oasis-open.org/archives/office/200505/msg00006.html
 $types = '(?:' . implode( '|', $opendocTypes ) . ')';
 $opendocRegex = "/^mimetype(application\/vnd\.oasis\.opendocument\.$types)/";
-wfDebug( __METHOD__.": $opendocRegex\n" );

 $openxmlRegex = "/^\[Content_Types\].xml/";
-
+
 if( preg_match( $opendocRegex, substr( $header, 30 ), $matches ) ) {
     $mime = $matches[1];
     wfDebug( __METHOD__.": detected $mime from ZIP archive\n" );
-} elseif( preg_match( $openxmlRegex, substr( $header, 30 ), $matches ) ) {
+} elseif( preg_match( $openxmlRegex, substr( $header, 30 ) ) ) {
     $mime = "application/x-opc+zip";
     if( $ext !== true && $ext !== false ) {
         /** This is the mode used by getPropsFromPath
@@ -624,11 +624,39 @@
              * find the proper mime type for that file extension */
             $mime = $this->guessTypesForExtension( $ext );
         } else {
-            $mime = 'application/zip';
+            $mime = "application/zip";
         }
-
     }
     wfDebug( __METHOD__.": detected an Open Packaging Conventions archive: $mime\n" );
+} else if( substr( $header, 0, 8 ) == "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1" &&
+    ($headerpos = strpos( $tail, "PK\x03\x04" ) ) !== false &&
+    preg_match( $openxmlRegex, substr( $tail, $headerpos + 30 ) ) ) {
+    if( substr( $header, 512, 4) == "\xEC\xA5\xC1\x00" ) {
+        $mime = "application/msword";
+    }
+    switch( substr( $header, 512, 6) ) {
+        case "\xEC\xA5\xC1\x00\x0E\x00":
+        case "\xEC\xA5\xC1\x00\x1C\x00":
+        case "\xEC\xA5\xC1\x00\x43\x00":
+            $mime = "application/vnd.ms-powerpoint";
+            break;
+        case "\xFD\xFF\xFF\xFF\x10\x00":
+        case "\xFD\xFF\xFF\xFF\x1F\x00":
+        case "\xFD\xFF\xFF\xFF\x22\x00":
+        case "\xFD\xFF\xFF\xFF\x23\x00":
+        case "\xFD\xFF\xFF\xFF\x28\x00":
+        case "\xFD\xFF\xFF\xFF\x29\x00":
+        case "\xFD\xFF\xFF\xFF\x10\x02":
+        case "\xFD\xFF\xFF\xFF\x1F\x02":
+        case "\xFD\xFF\xFF\xFF\x22\x02":
+        case "\xFD\xFF\xFF\xFF\x23\x02":
+        case "\xFD\xFF\xFF\xFF\x28\x02":
+        case "\xFD\xFF\xFF\xFF\x29\x02":
+            $mime = "application/vnd.msexcel";
+            break;
+    }
+
+    wfDebug( __METHOD__.": detected a MS Office document with OPC trailer\n");
 } else {
     wfDebug( __METHOD__.": unable to identify type of ZIP archive\n" );
 }
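
The condition guarding the new branch in the diff can be read as three checks: the file starts with the 8-byte OLE2 compound-file signature, the tail chunk contains a ZIP local file header signature ("PK\x03\x04"), and the entry name found 30 bytes after that signature is [Content_Types].xml. The following is only a standalone sketch of that condition, not the committed code. The helper name looksLikeOleWithOpcTrailer and the sample data are hypothetical, and a plain string comparison stands in for the preg_match call used in the commit.

```php
<?php
// Hypothetical helper (not part of MimeMagic): reports whether a head/tail
// pair looks like an OLE2 document that carries an OPC (ZIP) trailer.
function looksLikeOleWithOpcTrailer( string $head, string $tail ): bool {
	// OLE2 compound-file signature, same bytes as in the diff.
	$oleMagic = "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1";
	if ( substr( $head, 0, 8 ) !== $oleMagic ) {
		return false;
	}

	// Look for a ZIP local file header signature anywhere in the tail chunk.
	$pos = strpos( $tail, "PK\x03\x04" );
	if ( $pos === false ) {
		return false;
	}

	// The entry name starts 30 bytes after the start of the signature.
	$name = '[Content_Types].xml';
	return substr( $tail, $pos + 30, strlen( $name ) ) === $name;
}

// Synthetic sample data (assumption: not taken from a real Office file).
$head = "\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1" . str_repeat( "\x00", 504 );
$tail = "junk" . "PK\x03\x04" . str_repeat( "\x00", 26 ) . "[Content_Types].xml";

var_dump( looksLikeOleWithOpcTrailer( $head, $tail ) );
```

Like the commit, the sketch only inspects the first "PK\x03\x04" occurrence in the tail, so a tail chunk whose first local file header belongs to a different entry would not match.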

Follow-up revisions

Revision | Commit summary | Author | Date
r81376 | Blacklist ZIP subtypes added in r68873, to avoid GIFAR. | tstarling | 05:35, 2 February 2011

Comments

Comment by Platonides, 15:00, 3 July 2010

$openxmlRegex should use strpos, not preg_match (I suspect the 30 is a lazy number, too)

I'd prefer that $mime get the value application/zip by being set explicitly in a default branch, instead of just keeping the value assigned at the beginning.
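
Both suggestions can be illustrated together. On the offset: the fixed-size part of a ZIP local file header is 30 bytes long (signature through the extra-field length), so the first entry's file name begins at offset 30. On the matching: a fixed-string comparison is enough for a literal name, and it also avoids the unescaped "." in "/^\[Content_Types\].xml/", which matches any character. The sketch below is only an illustration under those assumptions. The function name classifyZipHeader is hypothetical, substr is used here for the fixed-string check, and the extension handling of the real detectZipType is left out.

```php
<?php
// Hypothetical, reduced stand-in for the OPC check in detectZipType().
function classifyZipHeader( string $header ): string {
	// A ZIP local file header has 30 bytes of fixed fields; the name follows.
	$nameOffset = 30;
	$opcName = '[Content_Types].xml';

	if ( substr( $header, $nameOffset, strlen( $opcName ) ) === $opcName ) {
		// Literal comparison, no regular expression involved.
		$mime = 'application/x-opc+zip';
	} else {
		// Explicit default branch instead of relying on an earlier assignment.
		$mime = 'application/zip';
	}
	return $mime;
}

// Synthetic header (assumption: the 26 zero bytes stand in for real fields).
$sample = "PK\x03\x04" . str_repeat( "\x00", 26 ) . "[Content_Types].xml";
echo classifyZipHeader( $sample ), "\n";
```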

Comment by Tim Starling, 05:38, 2 February 2011

The problem with this is that it requires application/msword, application/vnd.ms-powerpoint and application/vnd.msexcel to be blacklisted. I have done so in r81376. What we really need is a proper solution for the GIFAR vulnerability, which doesn't create incentives for insecurity by requiring an insecure configuration to enable useful features like MS Office uploads.
