Extension:SyntaxHighlight GeSHi remote

From MediaWiki.org
SyntaxHighlight_GeSHi_remote

Release status: beta

Implementation: Tag
Description: Allows source code from remote URLs to be embedded and syntax highlighted on wiki pages
Author(s): Sascha
License: Apache License 2.0
Download: see Download instructions below
Example: <syntaxhighlight lang="cpp" url>http://mydomain.org/source.cpp</syntaxhighlight>
Tags: <syntaxhighlight>

What can this extension do?

This extension extends Extension:SyntaxHighlight_GeSHi to support highlighting remote files. It can embed the whole file or extract only parts of it, selected by line numbers or by special markers inside the remote file.

Unlike similar extensions, this one disallows the inclusion of local files to avoid security issues. A URL is accepted only if it has a non-empty scheme and that scheme is not 'file' (compared case-insensitively).
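
The scheme rule can be illustrated with a minimal Python sketch. This is not the extension's code (the patch is PHP and uses parse_url); the helper name is_allowed_url is hypothetical and only mirrors the rule as described here.

```python
from urllib.parse import urlparse


def is_allowed_url(url: str) -> bool:
    """Accept a URL only if it has a scheme and the scheme is not 'file'."""
    scheme = urlparse(url.strip()).scheme
    # An empty scheme means a bare path such as /var/www/source.cpp,
    # which would point at a local file and is therefore rejected.
    return scheme != "" and scheme.lower() != "file"
```

Under this sketch, http://mydomain.org/source.cpp is accepted, while file:///etc/passwd and a bare path such as /var/www/source.cpp are both rejected.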

Usage

The provided patch adds four parameters to the <syntaxhighlight> tag from Extension:SyntaxHighlight_GeSHi.

url
Specifies that the content of the syntaxhighlight tag is a URL pointing to a source file
filterlines
Specifies line ranges which should be extracted from the source file, e.g. filterlines="4-19,45,60-120"
filtersections
Specifies the titles of the sections to extract from the source file. A section starts at a line containing the section marker (see the sectionmarker attribute below) followed by the section title. It ends where the next section starts, or at a section marker followed only by whitespace. Multiple titles are separated by commas, for example filtersections="Step 1, Step 5", and each title may be a Perl-compatible regular expression.
sectionmarker
Specifies a Perl-compatible regular expression, including its delimiters, which matches a section marker in the source file. The default is sectionmarker="|\s*//>|", which matches a // line comment followed directly by a > sign.
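
As a rough illustration of how these attribute values can be interpreted, the following Python sketch parses a filterlines value into inclusive ranges and builds one regular expression per section title. It is an approximation, not the patch's PHP code. The function names are hypothetical, the marker is written without the | delimiters because Python patterns have none, and stripping whitespace around titles is an assumption of this sketch.

```python
import re
from typing import List, Pattern, Tuple


def parse_line_ranges(spec: str) -> List[Tuple[int, int]]:
    """Turn "4-19,45,60-120" into [(4, 19), (45, 45), (60, 120)]."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty entries such as a trailing comma
        first, _, last = part.partition("-")
        start = int(first)
        end = int(last) if last else start  # a single number is a one-line range
        ranges.append((start, end))
    return ranges


def section_patterns(titles: str, marker: str = r"\s*//>") -> List[Pattern[str]]:
    """Build one pattern per title: marker, optional whitespace, then the title."""
    return [
        re.compile(marker + r"\s*" + title.strip())
        for title in titles.split(",")
        if title.strip()
    ]
```

For example, parse_line_ranges("4-19,45,60-120") yields three ranges, and the second pattern returned by section_patterns("Step 1, Step 5") matches the line "  //> Step 5".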

Examples

Suppose the following file is accessible via the URL http://mydomain.org/source.cpp:

  1. //> Comment
  2. /**
  3.  * This tutorial code explains how to print the numbers 1 to 1000 to a console
  4.  * without using loop or conditional constructs.
  5.  */
  6. //>
  7. #include <stdio.h>
  8. #include <stdlib.h>
  9. #include <iostream.h>
  10.
  11. //> Too clever
  12. int main()
  13. {
  14.   printf("the numbers 1 to 1000\n");
  15. }
  16.
  17. //> TMP
  18. template<int N>
  19. struct NumberGeneration {
  20.   static void out(std::ostream& os)
  21.   {
  22.     NumberGeneration<N-1>::out(os);
  23.     os << N << std::endl;
  24.   }
  25. };
  26. template<>
  27. struct NumberGeneration<1> {
  28.   static void out(std::ostream& os)
  29.   {
  30.     os << 1 << std::endl;
  31.   }
  32. };
  33.
  34. int main() {
  35.    NumberGeneration<1000>::out(std::cout);
  36. }
  37.
  38. //> Recursion
  39. int print1000(int num=1) {
  40.   printf("%d\n", num);
  41.   return (num < 1000 && print1000(++num));
  42. }
  43.
  44. void main() {
  45.   print1000();
  46. }
  47.
  48. //> FPA
  49. int main(int j) {
  50.   printf("%d\n", j);
  51.   (main + (exit - main)*(j/1000))(j+1);
  52. }
  53.
  54. //> Favorite
  55. // The problem never required the algorithm to terminate
  56. void main(int n) {
  57.   printf("%d\n", n);
  58.   main(n+1);
  59. }

Then, <syntaxhighlight lang="cpp" filterlines="7-10" filtersections="Comment,Too clever,Favorite" url>http://mydomain.org/source.cpp</syntaxhighlight> shows:

/**
 * This tutorial code explains how to print the numbers 1 to 1000 to a console
 * without using loop or conditional constructs.
 */
#include <stdio.h>
#include <stdlib.h>
#include <iostream.h>
 
int main()
{
  printf("the numbers 1 to 1000\n");
}
 
// The problem never required the algorithm to terminate
void main(int n) {
  printf("%d\n", n);
  main(n+1);
}
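
The selection logic behind this result can be approximated with a short Python sketch. It works on a list of lines instead of a remote file, and it is a simplified reading of the behaviour described on this page, not the patch's PHP code. The function name filter_source, its parameters and the SAMPLE data are all hypothetical.

```python
import re


def filter_source(lines, sections, ranges, marker=r"\s*//>"):
    """Keep lines inside the requested sections or inside the line ranges."""
    section_res = [re.compile(marker + r"\s*" + title) for title in sections]
    any_marker = re.compile(marker)
    kept = []
    in_section = False
    for number, line in enumerate(lines, start=1):
        if in_section:
            if any_marker.search(line):
                # Any marker line ends the current section; fall through so the
                # same line can still start another requested section.
                in_section = False
            else:
                kept.append(line)
                continue
        if any(low <= number <= high for low, high in ranges):
            kept.append(line)
        if any(pattern.search(line) for pattern in section_res):
            in_section = True
    return kept


# A shortened stand-in for the example file above.
SAMPLE = [
    "//> Comment",         # line 1
    "/* intro */",         # line 2
    "//>",                 # line 3
    "#include <stdio.h>",  # line 4
    "",                    # line 5
    "//> Skipped",         # line 6
    "int unused;",         # line 7
    "//> Favorite",        # line 8
    "int main() {}",       # line 9
]

result = filter_source(SAMPLE, ["Comment", "Favorite"], [(4, 5)])
```

Here result contains the comment body from the Comment section, lines 4 and 5 selected by number, and the body of the Favorite section. The marker lines themselves and the Skipped section are left out.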

Limitations

  • No nesting of sections
  • Section markers must be the same inside one source file
  • The MediaWiki cache prevents modified remote content from showing up on the wiki page. You can work around this in one of the following ways:
    • Use Extension:MagicNoCache and add the magic word __NOCACHE__ to your page
    • Create a link which purges the cache for a given page:
      [{{fullurl:{{FULLPAGENAME}}}}?action=purge {{FULLPAGENAME}}]
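
For scripted use, for example a job that refreshes a page after the remote file has changed, a purge URL of the same shape can be assembled outside the wiki. The Python sketch below only builds the URL string. The helper name purge_url and the index.php location are assumptions, and depending on the wiki's configuration a purge request may require confirmation.

```python
from urllib.parse import urlencode


def purge_url(index_php: str, page_title: str) -> str:
    """Build an action=purge URL for the given page title."""
    # MediaWiki treats spaces and underscores in titles as equivalent.
    title = page_title.replace(" ", "_")
    return index_php + "?" + urlencode({"title": title, "action": "purge"})
```

With a hypothetical wiki at http://mydomain.org/w/index.php, purge_url("http://mydomain.org/w/index.php", "Main Page") returns http://mydomain.org/w/index.php?title=Main_Page&action=purge.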
      

Download instructions

Please save the code found below in a file (for example ~/SyntaxHighlight_GeSHi_remote.patch).

Installation

These installation instructions assume that you have already installed Extension:SyntaxHighlight_GeSHi.

Open a terminal and cd into $IP/extensions/SyntaxHighlight_GeSHi/. Note: $IP stands for the root directory of your MediaWiki installation, the same directory that holds LocalSettings.php. Then apply the patch to your SyntaxHighlight_GeSHi installation:

patch -p0 < ~/SyntaxHighlight_GeSHi_remote.patch

Code

Index: SyntaxHighlight_GeSHi.i18n.php
===================================================================
--- SyntaxHighlight_GeSHi.i18n.php      (revision 74418)
+++ SyntaxHighlight_GeSHi.i18n.php      (working copy)
@@ -12,12 +12,15 @@
  * @author Brion Vibber
  */
 $messages['en'] = array(
-       'syntaxhighlight-desc'         => 'Provides syntax highlighting <code>&lt;syntaxhighlight&gt;</code> using [http://qbnz.com/highlighter/ GeSHi - Generic Syntax Highlighter]',
+       'syntaxhighlight-desc'         => 'Provides syntax highlighting <code>&lt;syntaxhighlight&gt;</code> using [http://qbnz.com/highlighter/ GeSHi - Generic Syntax Highlighter], patched for remote file support',
        'syntaxhighlight-specify'      => 'You need to specify a language like this:',
        'syntaxhighlight-supported'    => 'Supported languages for syntax highlighting:',
        'syntaxhighlight-err-loading'  => '(error loading supported language list)',
        'syntaxhighlight-err-language' => 'Invalid language.',
        'geshi.css'                    => '/* CSS placed here will be applied to GeSHi syntax highlighting */',
+       'syntaxhighlight-err-urlscheme' => 'Invalid URL scheme: $1',
+       'syntaxhighlight-err-sectionmarker' => 'sectionmarker $1 must not contain an empty regular expression',
+       'syntaxhighlight-err-sectionmarker-missing' => 'The attribute sectionmarker is missing',
 );
 
 /** Message documentation (Message documentation)
Index: SyntaxHighlight_GeSHi.php
===================================================================
--- SyntaxHighlight_GeSHi.php   (revision 74418)
+++ SyntaxHighlight_GeSHi.php   (working copy)
@@ -43,7 +43,7 @@
 $wgExtensionCredits['parserhook']['SyntaxHighlight_GeSHi'] = array(
        'path'           => __FILE__,
        'name'           => 'SyntaxHighlight',
-       'author'         => array( 'Brion Vibber', 'Tim Starling', 'Rob Church', 'Niklas Laxström' ),
+       'author'         => array( 'Brion Vibber', 'Tim Starling', 'Rob Church', 'Niklas Laxström', 'patched by Sascha for URL support' ),
        'descriptionmsg' => 'syntaxhighlight-desc',
        'url'            => 'http://www.mediawiki.org/wiki/Extension:SyntaxHighlight_GeSHi',
 );
Index: SyntaxHighlight_GeSHi.class.php
===================================================================
--- SyntaxHighlight_GeSHi.class.php     (revision 74418)
+++ SyntaxHighlight_GeSHi.class.php     (working copy)
@@ -46,7 +46,42 @@
                        wfProfileOut( __METHOD__ );
                        return $error;
                }
-               $geshi = self::prepare( $text, $lang );
+        
+               // validate attributes for remote file support
+               $isurl = false;
+               if( isset( $args['url'] ) ) {
+                       $urlscheme = parse_url( $text, PHP_URL_SCHEME );
+                       if ( empty($urlscheme) || strcasecmp( $urlscheme, "file" ) == 0 ) {
+                               $error = self::formatError( htmlspecialchars( wfMsgForContent( 'syntaxhighlight-err-urlscheme', $urlscheme ) ) );
+                               wfProfileOut( __METHOD__ );
+                               return $error;
+                       }
+                       $isurl = true;
+               }
+               $sectionmarker = "|\s*//>|";
+               if( isset( $args['sectionmarker'] ) ) {
+                       $sectionmarker = $args['sectionmarker'];
+                       if (strlen($sectionmarker) < 3) {
+                               $error = self::formatError( htmlspecialchars( wfMsgForContent( 'syntaxhighlight-err-sectionmarker', $sectionmarker ) ) );
+                               wfProfileOut( __METHOD__ );
+                               return $error;
+                       }
+               }
+               $filtersections = "";
+               if ( isset( $args['filtersections'] ) ) {
+                       $filtersections = $args['filtersections'];
+               }
+               if ( empty($filtersections) == false && empty($sectionmarker) ) {
+                       $error = self::formatError( htmlspecialchars( wfMsgForContent( 'syntaxhighlight-err-sectionmarker-missing' ) ) );
+                       wfProfileOut( __METHOD__ );
+                       return $error;
+               }
+               $filterlines = "";
+               if ( isset( $args['filterlines'] ) ) {
+                       $filterlines = $args['filterlines'];
+               }
+        
+               $geshi = self::prepare( $text, $lang, $isurl, $sectionmarker, $filtersections, $filterlines );
                if( !$geshi instanceof GeSHi ) {
                        $error = self::formatError( htmlspecialchars( wfMsgForContent( 'syntaxhighlight-err-language' ) ) );
                        wfProfileOut( __METHOD__ );
@@ -203,6 +238,98 @@
        }
 
        /**
+        * Read a remote file and filter its contents. Filtering is done by including lines
+        * between certain markers or having a specific line number.
+        *
+        * @param string $url The URL of the remote file
+        * @param string $section_marker A regular expression matching the beginning of a section
+        * @param string $filter_sections A string containing section names which should be included, delimited by a comma (,)
+        * @param string $filter_lines A string containing line numbers or ranges which should be included, delimited by a comma (,)
+        */
+       private static function parseRemoteFile( $url, $section_marker, $filter_sections, $filter_lines ) {
+               $remotefile = fopen($url, 'r');
+               if ($remotefile === false) {
+                       $output = "Error: could not open " . $url;
+               }
+               else {
+                       $section_array = strlen($filter_sections) > 0 ? explode(",", $filter_sections) : array();
+                       $lines_array = strlen($filter_lines) > 0 ? explode(",", $filter_lines) : array();
+            
+                       // build an array containing ranges of line numbers ((3,6),(10,10),(20,34),...)
+                       $linerange_array = array();
+                       foreach ($lines_array as $linefilter) {
+                               $range = explode("-", $linefilter);
+                               if (count($range) == 1) {
+                                       array_push($linerange_array, array($range[0], $range[0]));
+                               }
+                               else {
+                                       array_push($linerange_array, array($range[0], $range[1]));
+                               }
+                       }
+            
+                       // build an array of regular expressions for the required sections
+                       $section_regexp_array = array();
+                       foreach($section_array as $section) {
+                               $section_regexp = substr($section_marker, 0, strlen($section_marker)-1) . "\s*" . $section . substr($section_marker, 0, 1);
+                               array_push($section_regexp_array, $section_regexp);
+                       }
+
+                       // matches any section, even an empty one (which serves for signalling the end of a named section)
+                       $section_unknown_regexp = substr($section_marker, 0, strlen($section_marker)-1) . ".*" . substr($section_marker, 0, 1);
+            
+                       $section_started = false;
+                       $linecount = 0;
+                       $output = "";
+                       while (!feof($remotefile)) {
+                               $linecount += 1;
+                               $line = fgets($remotefile);
+                
+                               if ($section_started == true) {
+                                       // we found a required section previously
+                                       if (preg_match($section_unknown_regexp, $line)) {
+                                               // found the beginning of an unknown section or the end of the current
+                                               // required section
+                                               $section_started = false;
+                                       }
+                                       else {
+                                               $output .= $line;
+                                               continue;
+                                       }
+                               }
+
+                               $added = false;
+                               // check if the line should be included due to required line numbers
+                               foreach($linerange_array as $linerange) {
+                                       if ($linecount >= $linerange[0] && $linecount <= $linerange[1]) {
+                                               $output .= $line;
+                                               $added = true;
+                                               // additionally check if this line signals a section start
+                                               foreach($section_regexp_array as $section_regexp) {
+                                                       if (preg_match($section_regexp, $line)) {
+                                                               $section_started = true;
+                                                               break;
+                                                       }
+                                               }
+                                               break;
+                                       }
+                               }
+                
+                               if($added == false) {
+                                       // line ranges did not match, check required sections
+                                       foreach($section_regexp_array as $section_regexp) {
+                                               if (preg_match($section_regexp, $line)) {
+                                                       $section_started = true;
+                                                       break;
+                                               }
+                                       }
+                               }
+                       }
+                       fclose($remotefile);
+               }
+               return $output;
+       }
+
+       /**
         * Initialise a GeSHi object to format some code, performing
         * common setup for all our uses of it
         *
@@ -210,9 +337,20 @@
         * @param string $lang
         * @return GeSHi
         */
-       private static function prepare( $text, $lang ) {
+       private static function prepare( $input, $lang, $url, $section_marker, $filter_sections, $filter_lines ) {
                self::initialise();
-               $geshi = new GeSHi( $text, $lang );
+               if ($url) {
+                       if (strlen($filter_sections) > 0 || strlen($filter_lines) > 0) {
+                               $output = self::parseRemoteFile( $input, $section_marker, $filter_sections, $filter_lines );
+                       }
+                       else {
+                               $output = file_get_contents($input);
+                       }
+               }
+               else {
+                       $output = $input;
+               }
+               $geshi = new GeSHi( $output, $lang );
                if( $geshi->error() == GESHI_ERROR_NO_SUCH_LANG ) {
                        return null;
                }
@@ -336,4 +474,4 @@
                return self::hSpecialVersion_GeSHi( $extensionTypes );
        }
 
-}
\ No newline at end of file
+}