r29292 MediaWiki - Code Review archive

Repository:	MediaWiki
Revision:	< r29291 | r29292 | r29293 >
Date:	12:39, 5 January 2008
Author:	tstarling
Status:	old
Tags:
Comment:
* Merged comment handling with the main loop of preprocessToDom(). This fixes a section numbering/marking regression introduced in r28588. Added parser tests demonstrating the issue.
* Merged includeonly/noinclude/onlyinclude handling with preprocessToDom(), and used the resulting überparser to fix another section numbering bug: bug 6563. The fix involves putting a template flag "T" into the section parameter of edit links. This flag indicates to extractSections() how <includeonly> etc. should be handled. If these two changes stick, I'll eventually describe the precise syntactic effects in RELEASE-NOTES.
* Added splitExtNode() for future use in LabeledSectionTransclusion.
* Added parser tests for bug 6563.
Modified paths:	/trunk/phase3/includes/Parser.php (modified) (history) /trunk/phase3/maintenance/parserTests.txt (modified) (history)
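
The "T" flag described in the comment can be illustrated with a small standalone sketch. This is not the code from this revision: `parseSectionId()` and the free-standing `PTD_FOR_INCLUSION` constant are hypothetical names chosen for illustration, and the integer cast is an added assumption (the committed code keeps the index as a string).

```php
<?php
// Hypothetical stand-in for the Parser::PTD_FOR_INCLUSION constant added in this revision.
const PTD_FOR_INCLUSION = 1;

/**
 * Split a section identifier such as "T-2" into [flags, sectionIndex].
 * Everything before the last "-" is treated as a flag; unknown flags are ignored.
 */
function parseSectionId(string $section): array {
    $parts = explode('-', $section);
    $index = (int) array_pop($parts); // assumption: cast to int for clarity
    $flags = 0;
    foreach ($parts as $part) {
        if ($part === 'T') {
            $flags |= PTD_FOR_INCLUSION;
        }
    }
    return [$flags, $index];
}

echo json_encode(parseSectionId('T-2')), "\n"; // prints [1,2]
echo json_encode(parseSectionId('1')), "\n";   // prints [0,1]
```

With this shape, an ordinary edit link such as `section=1` carries no flags, while a template edit link such as `section=T-2` asks the preprocessor to treat `<includeonly>` as it would during inclusion.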

Diff

Index: trunk/phase3/maintenance/parserTests.txt
—	—	@@ -2575,6 +2575,64 @@
2576	2576	</p>
2577	2577	!! end
2578	2578
	2579	+!! article
	2580	+Template:Includeonly section
	2581	+!! text
	2582	+<includeonly>
	2583	+==Includeonly section==
	2584	+</includeonly>
	2585	+==Section T-1==
	2586	+!!endarticle
	2587	+
	2588	+!! test
	2589	+Bug 6563: Edit link generation for section shown by <includeonly>
	2590	+!! input
	2591	+{{includeonly section}}
	2592	+!! result
	2593	+<a name="Includeonly_section"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Template:Includeonly_section&action=edit&section=T-1" title="Template:Includeonly section">edit</a>]</span> <span class="mw-headline">Includeonly section</span></h2>
	2594	+<a name="Section_T-1"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Template:Includeonly_section&action=edit&section=T-2" title="Template:Includeonly section">edit</a>]</span> <span class="mw-headline">Section T-1</span></h2>
	2595	+
	2596	+!! end
	2597	+
	2598	+# Uses same input as the contents of [[Template:Includeonly section]]
	2599	+!! test
	2600	+Bug 6563: Section extraction for section shown by <includeonly>
	2601	+!! options
	2602	+section=T-2
	2603	+!! input
	2604	+<includeonly>
	2605	+==Includeonly section==
	2606	+</includeonly>
	2607	+==Section T-2==
	2608	+!! result
	2609	+==Section T-2==
	2610	+!! end
	2611	+
	2612	+!! test
	2613	+Bug 6563: Edit link generation for section suppressed by <includeonly>
	2614	+!! input
	2615	+<includeonly>
	2616	+==Includeonly section==
	2617	+</includeonly>
	2618	+==Section 1==
	2619	+!! result
	2620	+<a name="Section_1"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Parser_test&action=edit&section=1" title="Edit section: Section 1">edit</a>]</span> <span class="mw-headline">Section 1</span></h2>
	2621	+
	2622	+!! end
	2623	+
	2624	+!! test
	2625	+Bug 6563: Section extraction for section suppressed by <includeonly>
	2626	+!! options
	2627	+section=1
	2628	+!! input
	2629	+<includeonly>
	2630	+==Includeonly section==
	2631	+</includeonly>
	2632	+==Section 1==
	2633	+!! result
	2634	+==Section 1==
	2635	+!! end
	2636	+
2579	2637	###
2580	2638	### Pre-save transform tests
2581	2639	###
—	—	@@ -3504,8 +3562,8 @@
3505	3563	==Section 4==
3506	3564	!! result
3507	3565	<a name="Section_0"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Parser_test&action=edit&section=1" title="Edit section: Section 0">edit</a>]</span> <span class="mw-headline">Section 0</span></h2>
3508		-<a name="Section_1"></a><h3><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Template:Sections&action=edit&section=1" title="Template:Sections">edit</a>]</span> <span class="mw-headline">Section 1</span></h3>
3509		-<a name="Section_2"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Template:Sections&action=edit&section=2" title="Template:Sections">edit</a>]</span> <span class="mw-headline">Section 2</span></h2>
	3566	+<a name="Section_1"></a><h3><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Template:Sections&action=edit&section=T-1" title="Template:Sections">edit</a>]</span> <span class="mw-headline">Section 1</span></h3>
	3567	+<a name="Section_2"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Template:Sections&action=edit&section=T-2" title="Template:Sections">edit</a>]</span> <span class="mw-headline">Section 2</span></h2>
3510	3568	<a name="Section_4"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=Parser_test&action=edit&section=2" title="Edit section: Section 4">edit</a>]</span> <span class="mw-headline">Section 4</span></h2>
3511	3569
3512	3570	!! end
—	—	@@ -6223,7 +6281,7 @@
6224	6282	!! input
6225	6283	{{MediaWiki:Fake}}
6226	6284	!! result
6227		-<a name="header"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=MediaWiki:Fake&action=edit&section=1" title="MediaWiki:Fake">edit</a>]</span> <span class="mw-headline">header</span></h2>
	6285	+<a name="header"></a><h2><span class="editsection">[<a href="https://www.mediawiki.org/index.php?title=MediaWiki:Fake&action=edit&section=T-1" title="MediaWiki:Fake">edit</a>]</span> <span class="mw-headline">header</span></h2>
6228	6286
6229	6287	!! end
6230	6288
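
The parser tests above hinge on one fact: a heading inside <includeonly> counts as a section only when the page is parsed for inclusion. The following is a rough, regex-based sketch of that difference. `visibleHeadings()` is a hypothetical helper; it ignores <onlyinclude> and nesting, and it does not reflect the DOM-based approach that preprocessToDom() takes in this revision.

```php
<?php
/**
 * List heading titles in the order they would be numbered.
 * Hypothetical helper; a simplification, not MediaWiki's preprocessor.
 */
function visibleHeadings(string $text, bool $forInclusion): array {
    if ($forInclusion) {
        // Included as a template: drop <noinclude> content, unwrap <includeonly>.
        $text = preg_replace('~<noinclude>.*?</noinclude>~s', '', $text);
        $text = str_replace(['<includeonly>', '</includeonly>'], '', $text);
    } else {
        // Viewed directly: drop <includeonly> content, unwrap <noinclude>.
        $text = preg_replace('~<includeonly>.*?</includeonly>~s', '', $text);
        $text = str_replace(['<noinclude>', '</noinclude>'], '', $text);
    }
    // Match lines of the form "==Title==" with balanced equals signs.
    preg_match_all('/^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$/m', $text, $m);
    return $m[2];
}

$wikitext = "<includeonly>\n==Includeonly section==\n</includeonly>\n==Section T-1==";
$included = visibleHeadings($wikitext, true);
$direct   = visibleHeadings($wikitext, false);

// "Section T-1" is the second heading under inclusion but the first in a direct view.
echo array_search('Section T-1', $included) + 1, "\n"; // prints 2
echo array_search('Section T-1', $direct) + 1, "\n";   // prints 1
```

The same wikitext therefore yields two different section numbers for the same heading, which is why the edit link has to record which of the two parses produced the number.
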
Index: trunk/phase3/includes/Parser.php
—	—	@@ -71,6 +71,9 @@
72	72	const COLON_STATE_COMMENTDASH = 6;
73	73	const COLON_STATE_COMMENTDASHDASH = 7;
74	74
	75	+ // Flags for preprocessToDom
	76	+ const PTD_FOR_INCLUSION = 1;
	77	+
75	78	/**#@+
76	79	* @private
77	80	*/
—	—	@@ -928,11 +931,6 @@
929	932	return $text ;
930	933	}
931	934
932		- # Remove <noinclude> tags and <includeonly> sections
933		- $text = strtr( $text, array( '<onlyinclude>' => '' , '</onlyinclude>' => '' ) );
934		- $text = strtr( $text, array( '<noinclude>' => '', '</noinclude>' => '') );
935		- $text = StringUtils::delimiterReplace( '<includeonly>', '</includeonly>', '', $text );
936		-
937	935	$text = $this->replaceVariables( $text );
938	936	$text = Sanitizer::removeHTMLtags( $text, array( &$this, 'attributeStripCallback' ), false, array_keys( $this->mTransparentTagHooks ) );
939	937	wfRunHooks( 'InternalParseBeforeLinks', array( &$this, &$text, &$this->mStripState ) );
—	—	@@ -2541,17 +2539,32 @@
2542	2540	}
2543	2541
2544	2542	/**
2545		- * Parse any parentheses in format ((title|part|part)} and return the document tree
	2543	+ * Preprocess some wikitext and return the document tree.
2546	2544	* This is the ghost of replace_variables().
2547	2545	*
2548	2546	* @param string $text The text to parse
	2547	+ * @param integer flags Bitwise combination of:
	2548	+ * self::PTD_FOR_INCLUSION Handle <noinclude>/<includeonly> as if the text is being
	2549	+ * included. Default is to assume a direct page view.
	2550	+ *
	2551	+ * The generated DOM tree must depend only on the input text, the flags, and $this->ot['msg'].
	2552	+ * The DOM tree must be the same in OT_HTML and OT_WIKI mode, to avoid a regression of bug 4899.
	2553	+ *
	2554	+ * Any flag added to the $flags parameter here, or any other parameter liable to cause a
	2555	+ * change in the DOM tree for a given text, must be passed through the section identifier
	2556	+ * in the section edit link and thus back to extractSections().
	2557	+ *
	2558	+ * The output of this function is currently only cached in process memory, but a persistent
	2559	+ * cache may be implemented at a later date which takes further advantage of these strict
	2560	+ * dependency requirements.
	2561	+ *
2549	2562	* @private
2550	2563	*/
2551		- function preprocessToDom ( $text ) {
	2564	+ function preprocessToDom ( $text, $flags = 0 ) {
2552	2565	wfProfileIn( __METHOD__ );
2553	2566	wfProfileIn( __METHOD__.'-makexml' );
2554	2567
2555		- static $msgRules, $normalRules;
	2568	+ static $msgRules, $normalRules, $inclusionSupertags, $nonInclusionSupertags;
2556	2569	if ( !$msgRules ) {
2557	2570	$msgRules = array(
2558	2571	'{' => array(
—	—	@@ -2592,19 +2605,32 @@
2593	2606	} else {
2594	2607	$rules = $normalRules;
2595	2608	}
	2609	+ $forInclusion = $flags & self::PTD_FOR_INCLUSION;
2596	2610
2597		- if ( $this->ot['html'] || ( $this->ot['pre'] && $this->mOptions->getRemoveComments() ) ) {
2598		- $text = Sanitizer::removeHTMLcomments( $text );
	2611	+ $xmlishElements = $this->getStripList();
	2612	+ $enableOnlyinclude = false;
	2613	+ if ( $forInclusion ) {
	2614	+ $ignoredTags = array( 'includeonly', '/includeonly' );
	2615	+ $ignoredElements = array( 'noinclude' );
	2616	+ $xmlishElements[] = 'noinclude';
	2617	+ if ( strpos( $text, '<onlyinclude>' ) !== false && strpos( $text, '</onlyinclude>' ) !== false ) {
	2618	+ $enableOnlyinclude = true;
	2619	+ }
	2620	+ } else {
	2621	+ $ignoredTags = array( 'noinclude', '/noinclude', 'onlyinclude', '/onlyinclude' );
	2622	+ $ignoredElements = array( 'includeonly' );
	2623	+ $xmlishElements[] = 'includeonly';
2599	2624	}
	2625	+ $xmlishRegex = implode( '|', array_merge( $xmlishElements, $ignoredTags ) );
2600	2626
2601		- $extElements = implode( '|', $this->getStripList() );
2602	2627	// Use "A" modifier (anchored) instead of "^", because ^ doesn't work with an offset
2603		- $extElementsRegex = "/($extElements)(?:\s|\/>|>)|(!--)/iA";
	2628	+ $elementsRegex = "~($xmlishRegex)(?:\s|\/>|>)|(!--)~iA";
2604	2629
2605	2630	$stack = array(); # Stack of unclosed parentheses
2606	2631	$stackIndex = -1; # Stack read pointer
2607	2632
2608	2633	$searchBase = implode( '', array_keys( $rules ) ) . '<';
	2634	+ $revText = strrev( $text ); // For fast reverse searches
2609	2635
2610	2636	$i = -1; # Input pointer, starts out pointing to a pseudo-newline before the start
2611	2637	$topAccum = '<root>'; # Top level text accumulator
—	—	@@ -2614,8 +2640,27 @@
2615	2641	$findPipe = false; # True to take notice of pipe characters
2616	2642	$headingIndex = 1;
2617	2643	$noMoreGT = false; # True if there are no more greater-than (>) signs right of $i
	2644	+ $findOnlyinclude = $enableOnlyinclude; # True to ignore all input up to the next <onlyinclude>
2618	2645
2619		- while ( $i < strlen( $text ) ) {
	2646	+ if ( $enableOnlyinclude ) {
	2647	+ $i = 0;
	2648	+ }
	2649	+
	2650	+ while ( true ) {
	2651	+ if ( $findOnlyinclude ) {
	2652	+ // Ignore all input up to the next <onlyinclude>
	2653	+ $startPos = strpos( $text, '<onlyinclude>', $i );
	2654	+ if ( $startPos === false ) {
	2655	+ // Ignored section runs to the end
	2656	+ $accum .= '<ignore>' . htmlspecialchars( substr( $text, $i ) ) . '</ignore>';
	2657	+ break;
	2658	+ }
	2659	+ $tagEndPos = $startPos + strlen( '<onlyinclude>' ); // past-the-end
	2660	+ $accum .= '<ignore>' . htmlspecialchars( substr( $text, $i, $tagEndPos - $i ) ) . '</ignore>';
	2661	+ $i = $tagEndPos;
	2662	+ $findOnlyinclude = false;
	2663	+ }
	2664	+
2620	2665	if ( $i == -1 ) {
2621	2666	$found = 'line-start';
2622	2667	$curChar = '';
—	—	@@ -2684,8 +2729,14 @@
2685	2730
2686	2731	if ( $found == 'angle' ) {
2687	2732	$matches = false;
	2733	+ // Handle </onlyinclude>
	2734	+ if ( $enableOnlyinclude && substr( $text, $i, strlen( '</onlyinclude>' ) ) == '</onlyinclude>' ) {
	2735	+ $findOnlyinclude = true;
	2736	+ continue;
	2737	+ }
	2738	+
2688	2739	// Determine element name
2689		- if ( !preg_match( $extElementsRegex, $text, $matches, 0, $i + 1 ) ) {
	2740	+ if ( !preg_match( $elementsRegex, $text, $matches, 0, $i + 1 ) ) {
2690	2741	// Element name missing or not listed
2691	2742	$accum .= '<';
2692	2743	++$i;
—	—	@@ -2693,21 +2744,37 @@
2694	2745	}
2695	2746	// Handle comments
2696	2747	if ( isset( $matches[2] ) && $matches[2] == '!--' ) {
2697		- // HTML comment, scan to end
2698		- $endpos = strpos( $text, '-->', $i + 4 );
2699		- if ( $endpos === false ) {
	2748	+ // To avoid leaving blank lines, when a comment is both preceded
	2749	+ // and followed by a newline (ignoring spaces), trim leading and
	2750	+ // trailing spaces and one of the newlines.
	2751	+
	2752	+ // Find the end
	2753	+ $endPos = strpos( $text, '-->', $i + 4 );
	2754	+ if ( $endPos === false ) {
2700	2755	// Unclosed comment in input, runs to end
2701	2756	$inner = substr( $text, $i );
2702		- if ( $this->ot['html'] ) {
2703		- // Close it so later stripping can remove it
2704		- $inner .= '-->';
2705		- }
2706	2757	$accum .= '<comment>' . htmlspecialchars( $inner ) . '</comment>';
2707	2758	$i = strlen( $text );
2708	2759	} else {
2709		- $inner = substr( $text, $i, $endpos - $i + 3 );
	2760	+ // Search backwards for leading whitespace
	2761	+ $wsStart = $i ? ( $i - strspn( $revText, ' ', strlen( $text ) - $i - 1 ) ) : 0;
	2762	+ // Search forwards for trailing whitespace
	2763	+ // $wsEnd will be the position of the last space
	2764	+ $wsEnd = $endPos + 2 + strspn( $text, ' ', $endPos + 3 );
	2765	+ // Eat the line if possible
	2766	+ if ( $wsStart > 0 && substr( $text, $wsStart - 1, 1 ) == "\n"
	2767	+ && substr( $text, $wsEnd + 1, 1 ) == "\n" )
	2768	+ {
	2769	+ $startPos = $wsStart;
	2770	+ $endPos = $wsEnd + 1;
	2771	+ } else {
	2772	+ // No line to eat, just take the comment itself
	2773	+ $startPos = $i;
	2774	+ $endPos += 2;
	2775	+ }
	2776	+ $inner = substr( $text, $startPos, $endPos - $startPos + 1 );
2710	2777	$accum .= '<comment>' . htmlspecialchars( $inner ) . '</comment>';
2711		- $i = $endpos + 3;
	2778	+ $i = $endPos + 1;
2712	2779	}
2713	2780	continue;
2714	2781	}
—	—	@@ -2724,6 +2791,15 @@
2725	2792	++$i;
2726	2793	continue;
2727	2794	}
	2795	+
	2796	+ // Handle ignored tags
	2797	+ if ( in_array( $name, $ignoredTags ) ) {
	2798	+ $accum .= '<ignore>' . htmlspecialchars( substr( $text, $i, $tagEndPos - $i + 1 ) ) . '</ignore>';
	2799	+ $i = $tagEndPos + 1;
	2800	+ continue;
	2801	+ }
	2802	+
	2803	+ $tagStartPos = $i;
2728	2804	if ( $text[$tagEndPos-1] == '/' ) {
2729	2805	$attrEnd = $tagEndPos - 1;
2730	2806	$inner = null;
—	—	@@ -2743,6 +2819,13 @@
2744	2820	$close = '';
2745	2821	}
2746	2822	}
	2823	+ // <includeonly> and <noinclude> just become <ignore> tags
	2824	+ if ( in_array( $name, $ignoredElements ) ) {
	2825	+ $accum .= '<ignore>' . htmlspecialchars( substr( $text, $tagStartPos, $i - $tagStartPos ) )
	2826	+ . '</ignore>';
	2827	+ continue;
	2828	+ }
	2829	+
2747	2830	$accum .= '<ext>';
2748	2831	if ( $attrEnd <= $attrStart ) {
2749	2832	$attr = '';
—	—	@@ -2784,13 +2867,11 @@
2785	2868	// A heading must be open, otherwise \n wouldn't have been in the search list
2786	2869	assert( $piece['open'] == "\n" );
2787	2870	assert( $stackIndex == 0 );
2788		- // Search back through the accumulator to see if it has a proper close
2789		- // No efficient way to do this in PHP AFAICT: strrev, PCRE search with $ anchor
2790		- // and rtrim are all O(N) in total size. Optimal would be O(N) in trailing
2791		- // whitespace size only.
	2871	+ // Search back through the input to see if it has a proper close
	2872	+ // Do this using the reversed string since the other solutions (end anchor, etc.) are inefficient
2792	2873	$m = false;
2793	2874	$count = $piece['count'];
2794		- if ( preg_match( "/(={{$count}})\s*$/", $accum, $m, 0, $count ) ) {
	2875	+ if ( preg_match( "/\s*(={{$count}})/A", $revText, $m, 0, strlen( $text ) - $i ) ) {
2795	2876	// Found match, output <h>
2796	2877	$count = min( strlen( $m[1] ), $count );
2797	2878	$element = "<h level=\"$count\" i=\"$headingIndex\">$accum</h>";
—	—	@@ -3022,27 +3103,6 @@
3023	3104	}
3024	3105
3025	3106	/**
3026		- * Convert text to a document tree, like preprocessToDom(), but with some special handling
3027		- * assuming the source text is from a template -- specifically noinclude/includeonly behaviour.
3028		- */
3029		- function preprocessTplToDom( $text ) {
3030		- # If there are any <onlyinclude> tags, only include them
3031		- if ( !$this->ot['msg'] ) {
3032		- if ( in_string( '<onlyinclude>', $text ) && in_string( '</onlyinclude>', $text ) ) {
3033		- $replacer = new OnlyIncludeReplacer;
3034		- StringUtils::delimiterReplaceCallback( '<onlyinclude>', '</onlyinclude>',
3035		- array( &$replacer, 'replace' ), $text );
3036		- $text = $replacer->output;
3037		- }
3038		- # Remove <noinclude> sections and <includeonly> tags
3039		- $text = StringUtils::delimiterReplace( '<noinclude>', '</noinclude>', '', $text );
3040		- $text = strtr( $text, array( '<includeonly>' => '' , '</includeonly>' => '' ) );
3041		- }
3042		-
3043		- return $this->preprocessToDom( $text );
3044		- }
3045		-
3046		- /**
3047	3107	* Replace magic variables, templates, and template arguments
3048	3108	* with the appropriate text. Templates are substituted recursively,
3049	3109	* taking care to avoid infinite loops.
—	—	@@ -3311,7 +3371,7 @@
3312	3372	} else {
3313	3373	$text = $this->interwikiTransclude( $title, 'raw' );
3314	3374	// Preprocess it like a template
3315		- $text = $this->preprocessTplToDom( $text );
	3375	+ $text = $this->preprocessToDom( $text, self::PTD_FOR_INCLUSION );
3316	3376	$isDOM = true;
3317	3377	}
3318	3378	$found = true;
—	—	@@ -3404,7 +3464,7 @@
3405	3465	return array( false, $title );
3406	3466	}
3407	3467
3408		- $dom = $this->preprocessTplToDom( $text );
	3468	+ $dom = $this->preprocessToDom( $text, self::PTD_FOR_INCLUSION );
3409	3469	$this->mTplDomCache[ $titleText ] = $dom;
3410	3470
3411	3471	if (! $title->equals($cacheTitle)) {
—	—	@@ -3906,10 +3966,13 @@
3907	3967	}
3908	3968	# give headline the correct <h#> tag
3909	3969	if( $showEditLink && $sectionIndex !== false ) {
3910		- if( $isTemplate )
3911		- $editlink = $sk->editSectionLinkForOther($titleText, $sectionIndex);
3912		- else
	3970	+ if( $isTemplate ) {
	3971	+ # Put a T flag in the section identifier, to indicate to extractSections()
	3972	+ # that sections inside <includeonly> should be counted.
	3973	+ $editlink = $sk->editSectionLinkForOther($titleText, "T-$sectionIndex");
	3974	+ } else {
3913	3975	$editlink = $sk->editSectionLink($this->mTitle, $sectionIndex, $headlineHint);
	3976	+ }
3914	3977	} else {
3915	3978	$editlink = '';
3916	3979	}
—	—	@@ -4910,14 +4973,22 @@
4911	4974	*
4912	4975	* External callers should use the getSection and replaceSection methods.
4913	4976	*
4914		- * @param $text Page wikitext
4915		- * @param $section Numbered section. 0 pulls the text before the first
4916		- * heading; other numbers will pull the given section
4917		- * along with its lower-level subsections. If the section is
4918		- * not found, $mode=get will return $newtext, and
4919		- * $mode=replace will return $text.
4920		- * @param $mode One of "get" or "replace"
4921		- * @param $newText Replacement text for section data.
	4977	+ * @param string $text Page wikitext
	4978	+ * @param string $section A section identifier string of the form:
	4979	+ * <flag1> - <flag2> - ... - <section number>
	4980	+ *
	4981	+ * Currently the only recognised flag is "T", which means the target section number
	4982	+ * was derived during a template inclusion parse, in other words this is a template
	4983	+ * section edit link. If no flags are given, it was an ordinary section edit link.
	4984	+ * This flag is required to avoid a section numbering mismatch when a section is
	4985	+ * enclosed by <includeonly> (bug 6563).
	4986	+ *
	4987	+ * The section number 0 pulls the text before the first heading; other numbers will
	4988	+ * pull the given section along with its lower-level subsections. If the section is
	4989	+ * not found, $mode=get will return $newtext, and $mode=replace will return $text.
	4990	+ *
	4991	+ * @param string $mode One of "get" or "replace"
	4992	+ * @param string $newText Replacement text for section data.
4922	4993	* @return string for "get", the extracted section text.
4923	4994	* for "replace", the whole page with the section replaced.
4924	4995	*/
—	—	@@ -4931,8 +5002,17 @@
4932	5003	$outText = '';
4933	5004	$frame = new PPFrame( $this );
4934	5005
	5006	+ // Process section extraction flags
	5007	+ $flags = 0;
	5008	+ $sectionParts = explode( '-', $section );
	5009	+ $sectionIndex = array_pop( $sectionParts );
	5010	+ foreach ( $sectionParts as $part ) {
	5011	+ if ( $part == 'T' ) {
	5012	+ $flags |= self::PTD_FOR_INCLUSION;
	5013	+ }
	5014	+ }
4935	5015	// Preprocess the text
4936		- $dom = $this->preprocessToDom( $text );
	5016	+ $dom = $this->preprocessToDom( $text, $flags );
4937	5017	$root = $dom->documentElement;
4938	5018
4939	5019	// <h> nodes indicate section breaks
—	—	@@ -4940,13 +5020,13 @@
4941	5021	$node = $root->firstChild;
4942	5022
4943	5023	// Find the target section
4944		- if ( $section == 0 ) {
	5024	+ if ( $sectionIndex == 0 ) {
4945	5025	// Section zero doesn't nest, level=big
4946	5026	$targetLevel = 1000;
4947	5027	} else {
4948	5028	while ( $node ) {
4949	5029	if ( $node->nodeName == 'h' ) {
4950		- if ( $curIndex + 1 == $section ) {
	5030	+ if ( $curIndex + 1 == $sectionIndex ) {
4951	5031	break;
4952	5032	}
4953	5033	$curIndex++;
—	—	@@ -4975,7 +5055,7 @@
4976	5056	if ( $node->nodeName == 'h' ) {
4977	5057	$curIndex++;
4978	5058	$curLevel = $node->getAttribute( 'level' );
4979		- if ( $curIndex != $section && $curLevel <= $targetLevel ) {
	5059	+ if ( $curIndex != $sectionIndex && $curLevel <= $targetLevel ) {
4980	5060	break;
4981	5061	}
4982	5062	}
—	—	@@ -5012,9 +5092,9 @@
5013	5093	*
5014	5094	* If a section contains subsections, these are also returned.
5015	5095	*
5016		- * @param $text String: text to look in
5017		- * @param $section Integer: section number
5018		- * @param $deftext: default to return if section is not found
	5096	+ * @param string $text text to look in
	5097	+ * @param string $section section identifier
	5098	+ * @param string $deftext default to return if section is not found
5019	5099	* @return string text of the requested section
5020	5100	*/
5021	5101	public function getSection( $text, $section, $deftext='' ) {
—	—	@@ -5217,8 +5297,9 @@
5218	5298	const NO_ARGS = 1;
5219	5299	const NO_TEMPLATES = 2;
5220	5300	const STRIP_COMMENTS = 4;
	5301	+ const NO_IGNORE = 8;
5221	5302
5222		- const RECOVER_ORIG = 3;
	5303	+ const RECOVER_ORIG = 11;
5223	5304
5224	5305	/**
5225	5306	* Construct a new preprocessor frame.
—	—	@@ -5323,11 +5404,24 @@
5324	5405	}
5325	5406	} elseif ( $root->nodeName == 'comment' ) {
5326	5407	# HTML-style comment
5327		- if ( $flags & self::STRIP_COMMENTS ) {
	5408	+ if ( $this->parser->ot['html']
	5409	+ || ( $this->parser->ot['pre'] && $this->mOptions->getRemoveComments() )
	5410	+ || ( $flags & self::STRIP_COMMENTS ) )
	5411	+ {
5328	5412	$s = '';
5329	5413	} else {
5330	5414	$s = $root->textContent;
5331	5415	}
	5416	+ } elseif ( $root->nodeName == 'ignore' ) {
	5417	+ # Output suppression used by <includeonly> etc.
	5418	+ # OT_WIKI will only respect <ignore> in substed templates.
	5419	+ # The other output types respect it unless NO_IGNORE is set.
	5420	+ # extractSections() sets NO_IGNORE and so never respects it.
	5421	+ if ( ( !isset( $this->parent ) && $this->parser->ot['wiki'] ) || ( $flags & self::NO_IGNORE ) ) {
	5422	+ $s = $root->textContent;
	5423	+ } else {
	5424	+ $s = '';
	5425	+ }
5332	5426	} elseif ( $root->nodeName == 'ext' ) {
5333	5427	# Extension tag
5334	5428	$xpath = new DOMXPath( $root->ownerDocument );
—	—	@@ -5417,6 +5511,31 @@
5418	5512	return array( $name, $index, $values->item( 0 ) );
5419	5513	}
5420	5514
	5515	+ /**
	5516	+ * Split an <ext> node into an associative array containing name, attr, inner and close
	5517	+ * All values in the resulting array are DOMNodes. Inner and close are optional.
	5518	+ */
	5519	+ function splitExtNode( $node ) {
	5520	+ $xpath = new DOMXPath( $node->ownerDocument );
	5521	+ $names = $xpath->query( 'name', $node );
	5522	+ $attrs = $xpath->query( 'attr', $node );
	5523	+ $inners = $xpath->query( 'inner', $node );
	5524	+ $closes = $xpath->query( 'close', $node );
	5525	+ if ( !$names->length || !$attrs->length ) {
	5526	+ throw new MWException( 'Invalid ext node passed to ' . __METHOD__ );
	5527	+ }
	5528	+ $parts = array(
	5529	+ 'name' => $names->item( 0 ),
	5530	+ 'attr' => $attrs->item( 0 ) );
	5531	+ if ( $inners->length ) {
	5532	+ $parts['inner'] = $inners->item( 0 );
	5533	+ }
	5534	+ if ( $closes->length ) {
	5535	+ $parts['close'] = $closes->item( 0 );
	5536	+ }
	5537	+ return $parts;
	5538	+ }
	5539	+
5421	5540	function __toString() {
5422	5541	return 'frame{}';
5423	5542	}
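
The comment handling merged into preprocessToDom() follows one rule: when a comment is both preceded and followed by a newline (ignoring spaces), the surrounding spaces and one of the newlines are swallowed with it, so no blank line is left behind. A minimal sketch of that rule follows. `commentSpan()` and `stripComment()` are hypothetical helpers, and the backward scan uses a plain loop rather than the reversed-string `strspn()` trick from the diff.

```php
<?php
/**
 * Return [start, end) of the text to drop for the comment opening at offset $i.
 * Hypothetical helper; $i must point at the "<" of "<!--".
 */
function commentSpan(string $text, int $i): array {
    $close = strpos($text, '-->', $i + 4);
    if ($close === false) {
        // Unclosed comment: it runs to the end of the input.
        return [$i, strlen($text)];
    }
    $end = $close + 3; // past-the-end of "-->"

    // Walk backwards over spaces that precede the comment.
    $wsStart = $i;
    while ($wsStart > 0 && $text[$wsStart - 1] === ' ') {
        $wsStart--;
    }
    // Skip forwards over spaces that follow the comment.
    $wsEnd = $end + strspn($text, ' ', $end);

    $precededByNewline = $wsStart > 0 && $text[$wsStart - 1] === "\n";
    $followedByNewline = $wsEnd < strlen($text) && $text[$wsEnd] === "\n";
    if ($precededByNewline && $followedByNewline) {
        // Eat the whole line: leading spaces, comment, trailing spaces, one newline.
        return [$wsStart, $wsEnd + 1];
    }
    // No line to eat: take only the comment itself.
    return [$i, $end];
}

function stripComment(string $text, int $i): string {
    [$start, $end] = commentSpan($text, $i);
    return substr($text, 0, $start) . substr($text, $end);
}

echo json_encode(stripComment("a\n  <!-- c -->  \nb", 4)), "\n"; // prints "a\nb"
echo json_encode(stripComment("x <!-- c --> y", 2)), "\n";       // prints "x  y"
```

In the first call the comment sits alone on its line, so the line disappears entirely; in the second it is inline, so only the comment is removed and both neighbouring spaces survive.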

Follow-up revisions

Revision	Commit summary	Author	Date
r29415	* Fixed handling of whitespace before HTML comments, slightly broken since r2......	tstarling	05:12, 8 January 2008
r38208	Revert r38196, r38204 -- "(bugs 6089, 13079) Show edit section links for tran...	brion	23:56, 29 July 2008

Past revisions this follows-up on

Revision	Commit summary	Author	Date
r28588	* Strip comments early, before template expansion. This mimics the behaviour ...	tstarling	15:07, 17 December 2007

Status & tagging log

15:23, 12 September 2011 Meno25 changed the status of r29292 [removed: ok added: old]