Pyg Filter

Pyg


Apply Pygments syntax highlighting. Image output formats require PIL to be installed.

Aliases for this filter

  • pyg
  • pygments

Converts from file formats:

  • .*

To file formats:

  • .html
  • .tex
  • .svg
  • .txt
  • .png
  • .bmp
  • .gif
  • .jpg
  • .css
  • .sty

Available settings:

Setting | Description | Default
add-new-files | Boolean or list of extensions/patterns to match. | False
added-in-version | Dexy version when this filter was first available. |
additional-doc-filters | Filters to apply to additional documents created as side effects. | {}
additional-doc-settings | Settings to apply to additional documents created as side effects. | {}
allow-unknown-ext | Whether to allow unknown file extensions to be parsed with the TextLexer by default instead of raising an exception. | True
allow-unprintable-input | Whether to allow unprintable input to be replaced with dummy text instead of raising an exception. | True
data-type | Alias of custom data class to use to store filter output. | generic
examples | Templates which should be used as examples for this filter. | [u'pygments', u'pygments-image', u'pygments-stylesheets']
exclude-add-new-files | List of patterns to skip even if they match add-new-files. | []
exclude-new-files-from-dir | List of directories to skip when adding new files. | []
ext | File extension to output. | None
extension-map | Dictionary mapping input extensions to default output extensions. | None
formatter-settings | List of all settings which will be passed to the formatter constructor. | [u'style', u'full', u'linenos', u'noclasses']
full | Pygments formatter option: output a 'full' document including header/footer tags. | None
help | Helpstring for plugin. | Apply Pygments syntax highlighting. Image output formats require PIL to be installed.
input-extensions | List of extensions which this filter can accept as input. | [u'.*']
keep-originals | Whether, if additional-doc-filters are specified, the original unmodified docs should also be added. | False
lexer | The name of the pygments lexer to use (will normally be determined automatically, only use this if you need to override the default setting or your filename isn't mapped to the lexer you want to use. | None
lexer-args | Dictionary of custom arguments to be passed directly to the lexer. | {}
lexer-settings | List of all settings which will be passed to the lexer constructor. | []
line-numbers | Alternative name for 'linenos'. | None
linenos | Whether to include line numbers. May be set to 'table' or 'inline'. | None
mkdir | A directory which should be created in working dir. | None
mkdirs | A list of directories which should be created in working dir. | []
noclasses | If set to true, token <span> tags will not use CSS classes, but inline styles. | None
nodoc | Whether filter should be excluded from documentation. | False
output | Whether to output results of this filter by default by reporters such as 'output' or 'website'. | False
output-extensions | List of extensions which this filter can produce as output. | [u'.html', u'.tex', u'.svg', u'.txt', u'.png', u'.bmp', u'.gif', u'.jpg', u'.css', u'.sty']
override-workspace-exclude-filters | If True, document will be populated to other workspaces ignoring workspace-exclude-filters. | False
preserve-prior-data-class | Whether output data class should be set to match the input data class. | False
require-output | Should dexy raise an exception if no output is produced by this filter? | True
style | Formatter style to output. | default
tags | Tags which describe the filter. | []
unprintable-input-text | Dummy text to use instead of unprintable binary input. | not printable
variables | A dictionary of variable names and values to make available to this filter. | {}
vars | A dictionary of variable names and values to make available to this filter. | {}
workspace-exclude-filters | Filters whose output should be excluded from workspace. | [u'pyg']
workspace-includes | If set to a list of filenames or extensions, only these will be populated to working dir. | None

Pygments Examples

The pygments filter applies syntax highlighting to source code.

Here is some raw source code:

print "Hello!"

And here is how we specify in dexy.yaml that the pyg filter should be applied to this file:

- hello.py|pyg

The default format is HTML, so the code will come out looking like this, with HTML markup applied:

<div class="highlight"><pre><a name="hello.py-pyg.html-1"></a><span class="k">print</span> <span class="s">&quot;Hello!&quot;</span>
</pre></div>

If we follow the pyg filter with the l filter (which does nothing itself, but only accepts input documents with a .tex extension):

- hello.py|pyg|l

then pygments will output LaTeX markup instead:

\begin{Verbatim}[commandchars=\\\{\}]
\PY{k}{print} \PY{l+s}{\PYZdq{}}\PY{l+s}{Hello!}\PY{l+s}{\PYZdq{}}
\end{Verbatim}
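The same two outputs can be approximated with Pygments directly, outside Dexy. This is a minimal sketch, assuming a recent Pygments release on Python 3, so the sample line uses the print() function rather than the Python 2 statement shown above. The exact markup (token classes, line anchors) may differ between Pygments versions and from what the pyg filter emits.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter, LatexFormatter
from pygments.lexers import PythonLexer

# Python 3 spelling of the sample file (an assumption, see the note above).
source = 'print("Hello!")\n'

# HTML is the pyg filter's default output format.
html_output = highlight(source, PythonLexer(), HtmlFormatter())

# A .tex target corresponds to the LaTeX formatter.
latex_output = highlight(source, PythonLexer(), LatexFormatter())

print(html_output)
print(latex_output)
```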

Pygments will automatically determine which lexer to use based on the file extension of the incoming file. In the above examples, the .py file extension tells pygments that it should use the python lexer. If you need a different lexer than the default for the file extension, then you can pass a custom lexer argument:

- hello.txt|pyg:
    - pyg: { 'lexer' : 'python' }
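For comparison, here is a rough sketch of the two lookup styles in plain Pygments: one driven by the file name, one by an explicit alias such as the value given to the lexer setting. It assumes a recent Pygments on Python 3 and is not Dexy's own lookup code, which appears in the filter source further down.

```python
from pygments.lexers import get_lexer_by_name, get_lexer_for_filename

# Automatic choice: the file name pattern (*.py) selects the lexer.
auto_lexer = get_lexer_for_filename("hello.py")

# Explicit choice: look the lexer up by one of its aliases,
# which is the kind of value the 'lexer' setting expects.
named_lexer = get_lexer_by_name("python")

print(auto_lexer.name, named_lexer.name)
```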

You can see a list of all the available lexers, the file extensions they correspond to, and the aliases you can use to specify them, as follows:

$ pygmentize -L lexers
Pygments version 1.6, (c) 2006-2013 by Georg Brandl.

Lexers:
~~~~~~~
* Clipper, XBase:
    FoxPro (filenames *.PRG, *.prg)
* Cucumber, cucumber, Gherkin, gherkin:
    Gherkin (filenames *.feature)
* RobotFramework, robotframework:
    RobotFramework (filenames *.txt, *.robot)
* abap:
    ABAP (filenames *.abap)
* ada, ada95ada2005:
    Ada (filenames *.adb, *.ads, *.ada)

Dexy adds some custom file extension -> lexer mappings of its own, such as .pycon and .rbcon for python console and ruby console transcripts.
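A sketch of how such extra mappings could be layered on top of Pygments' own file name matching. The dictionary name, the helper function and the choice of the pycon and rbcon aliases are assumptions made for illustration; Dexy's real mapping table is not shown on this page. The fallback to the plain text lexer mirrors the allow-unknown-ext setting.

```python
import os

from pygments.lexers import get_lexer_by_name, get_lexer_for_filename
from pygments.util import ClassNotFound

# Hypothetical extra mappings, consulted before Pygments' own patterns.
CUSTOM_EXT_TO_ALIAS = {".pycon": "pycon", ".rbcon": "rbcon"}


def lexer_for(filename, allow_unknown_ext=True):
    """Pick a lexer for filename, preferring the custom mappings."""
    ext = os.path.splitext(filename)[1]
    if ext in CUSTOM_EXT_TO_ALIAS:
        return get_lexer_by_name(CUSTOM_EXT_TO_ALIAS[ext])
    try:
        return get_lexer_for_filename(filename)
    except ClassNotFound:
        if allow_unknown_ext:
            # Unknown extension: fall back to plain text.
            return get_lexer_by_name("text")
        raise


print(lexer_for("session.pycon").name)
print(lexer_for("session.rbcon").name)
```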

Custom formatter options can also be passed. Note that these options may be different for the HtmlFormatter and, say, the LatexFormatter. For more information, consult the Pygments documentation for the formatter you are using. Here is an example of setting the noclasses option to True:

- hello.py|pyg|-noclasses:
    - pyg: { 'noclasses' : True }

And here is the resulting HTML:

<div class="highlight" style="background: #f8f8f8"><pre style="line-height: 125%"><a name="hello.py-pyg.html-1"></a><span style="color: #008000; font-weight: bold">print</span> <span style="color: #BA2121">&quot;Hello!&quot;</span>
</pre></div>

As a reminder, here is the default HTML again:

<div class="highlight"><pre><a name="hello.py-pyg.html-1"></a><span class="k">print</span> <span class="s">&quot;Hello!&quot;</span>
</pre></div>
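The difference between the two outputs comes from a single formatter option. A small sketch in plain Pygments, assuming a recent release on Python 3 (the token classes and colours it prints will not match the Pygments 1.6 output above exactly):

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

source = 'print("Hello!")\n'

# Default: tokens are wrapped in <span class="..."> and need a stylesheet.
with_classes = highlight(source, PythonLexer(), HtmlFormatter())

# noclasses=True: the style is written inline on every token span.
inline_styles = highlight(source, PythonLexer(), HtmlFormatter(noclasses=True))

print(with_classes)
print(inline_styles)
```

Inline styles make the fragment self-contained, at the cost of larger markup and no way to restyle it later from a stylesheet.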

Pygments Image Examples

It is possible to generate image files from source code. Here is how to tell pygments to output a jpg:

- hello.py|pyg|jn

[Image: hello-py.jpg]

And here's a png, this time without line numbers and using a custom font:

- hello.py|pyg|pn:
    - pyg: { 'line_numbers' : False, 'font_name' : 'Source Code Pro' }

[Image: hello-py.png]

Here's a gif with the bw (black and white) style:

- hello.py|pyg|gn:
    - pyg: { 'style' : 'bw' }

[Image: hello-py.gif]

Consult the Pygments Documentation for details about other options you can pass to these formatters.

Pygments Stylesheets Examples

The pygments filter applies syntax highlighting to source code. For HTML and LaTeX, the default pygments output wraps the code with a class or macro, and a stylesheet needs to be applied to actually colorize the output.

You can use the pygmentize command line tool to list available styles:

$ pygmentize -L styles
Pygments version 1.6, (c) 2006-2013 by Georg Brandl.

Styles:
~~~~~~~
* monokai:
    This style mimics the Monokai color scheme.
* manni:
    A colorful style, inspired by the terminal highlighting style.
* rrt:
    Minimalistic "rrt" theme, based on Zap and Emacs defaults.
* perldoc:
    Style similar to the style used in the perldoc code blocks.
* borland:
    Style similar to the style used in the borland IDEs.

The same tool can generate a CSS stylesheet for use with HTML:

$ pygmentize -S manni -f html > manni.css
$ head -n 5 manni.css
.hll { background-color: #ffffcc }
.c { color: #0099FF; font-style: italic } /* Comment */
.err { color: #AA0000; background-color: #FFAAAA } /* Error */
.k { color: #006699; font-weight: bold } /* Keyword */
.o { color: #555555 } /* Operator */

pygmentize can also generate a .sty file for use with LaTeX:

$ pygmentize -S manni -f latex > manni.sty
$ head -n 5 manni.sty

\makeatletter
\def\PY@reset{\let\PY@it=\relax \let\PY@bf=\relax%
    \let\PY@ul=\relax \let\PY@tc=\relax%
    \let\PY@bc=\relax \let\PY@ff=\relax}
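The pygmentize commands above have direct equivalents in the Pygments Python API. This sketch assumes a recent Pygments on Python 3; the get_style_defs() calls are the same ones used by generate_css and generate_sty in the filter source at the end of this page.

```python
from pygments.formatters import HtmlFormatter, LatexFormatter
from pygments.styles import get_all_styles

# Names accepted by the 'style' setting (compare: pygmentize -L styles).
style_names = sorted(get_all_styles())

# CSS rules for HTML output (compare: pygmentize -S manni -f html).
manni_css = HtmlFormatter(style="manni").get_style_defs()

# Macro definitions for LaTeX output (compare: pygmentize -S manni -f latex).
manni_sty = LatexFormatter(style="manni").get_style_defs()

print(style_names)
print("\n".join(manni_css.splitlines()[:5]))
```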

The pygments filter also has some built-in ways to get stylesheets in your project.

Any blank file ending in .css or .sty passed through the pyg filter will have stylesheet contents generated if the final output extension of the file is also set to .css or .sty:

- pastie.css|pyg:
    - pyg: { 'ext' : '.css' }
    - contents: ' '

Here is an excerpt from the generated stylesheet:

.hll { background-color: #ffffcc }
.c { color: #408080; font-style: italic } /* Comment */
.err { border: 1px solid #FF0000 } /* Error */
.k { color: #008000; font-weight: bold } /* Keyword */
.o { color: #666666 } /* Operator */
.cm { color: #408080; font-style: italic } /* Comment.Multiline */
.cp { color: #BC7A00 } /* Comment.Preproc */
.c1 { color: #408080; font-style: italic } /* Comment.Single */
.cs { color: #408080; font-style: italic } /* Comment.Special */
.gd { color: #A00000 } /* Generic.Deleted */

Here is a LaTeX version:

- pastie.sty|pyg:
    - pyg: { 'ext' : '.sty' }
    - contents: ' '

Here is an excerpt from the generated .sty file:

\makeatletter
\def\PY@reset{\let\PY@it=\relax \let\PY@bf=\relax%
    \let\PY@ul=\relax \let\PY@tc=\relax%
    \let\PY@bc=\relax \let\PY@ff=\relax}
\def\PY@tok#1{\csname PY@tok@#1\endcsname}
\def\PY@toks#1+{\ifx\relax#1\empty\else%
    \PY@tok{#1}\expandafter\PY@toks\fi}
\def\PY@do#1{\PY@bc{\PY@tc{\PY@ul{%
    \PY@it{\PY@bf{\PY@ff{#1}}}}}}}

You can also insert style definitions directly into the header of a document. Jinja has a pygments object which contains entries for each of the available Pygments styles, for HTML or LaTeX.

Here's an example of an HTML file which incorporates CSS and also lists the available styles:

<html>
    <head>
        <style type="text/css">

{{ pygments['pastie.css'] }}

        </style>
    </head>
    <body>
        <p>Here is a list of all the available stylesheets:</p>
        <ul>
            {% for k in sorted(pygments) -%}
            <li>{{ k }}</li>
            {% endfor -%}
        </ul>
    </body>
</html>

This is run by passing the html through the jinja filter:

- example.html|jinja

Here is the resulting HTML:

<html>
    <head>
        <style type="text/css">

.hll { background-color: #ffffcc }
.c { color: #888888 } /* Comment */
.err { color: #a61717; background-color: #e3d2d2 } /* Error */
.k { color: #008800; font-weight: bold } /* Keyword */
.cm { color: #888888 } /* Comment.Multiline */
.cp { color: #cc0000; font-weight: bold } /* Comment.Preproc */
.c1 { color: #888888 } /* Comment.Single */
.cs { color: #cc0000; font-weight: bold; background-color: #fff0f0 } /* Comment.Special */
.gd { color: #000000; background-color: #ffdddd } /* Generic.Deleted */
.ge { font-style: italic } /* Generic.Emph */
.gr { color: #aa0000 } /* Generic.Error */
.gh { color: #333333 } /* Generic.Heading */
.gi { color: #000000; background-color: #ddffdd } /* Generic.Inserted */
.go { color: #888888 } /* Generic.Output */
.gp { color: #555555 } /* Generic.Prompt */
.gs { font-weight: bold } /* Generic.Strong */
.gu { color: #666666 } /* Generic.Subheading */
.gt { color: #aa0000 } /* Generic.Traceback */
.kc { color: #008800; font-weight: bold } /* Keyword.Constant */
.kd { color: #008800; font-weight: bold } /* Keyword.Declaration */
.kn { color: #008800; font-weight: bold } /* Keyword.Namespace */
.kp { color: #008800 } /* Keyword.Pseudo */
.kr { color: #008800; font-weight: bold } /* Keyword.Reserved */
.kt { color: #888888; font-weight: bold } /* Keyword.Type */
.m { color: #0000DD; font-weight: bold } /* Literal.Number */
.s { color: #dd2200; background-color: #fff0f0 } /* Literal.String */
.na { color: #336699 } /* Name.Attribute */
.nb { color: #003388 } /* Name.Builtin */
.nc { color: #bb0066; font-weight: bold } /* Name.Class */
.no { color: #003366; font-weight: bold } /* Name.Constant */
.nd { color: #555555 } /* Name.Decorator */
.ne { color: #bb0066; font-weight: bold } /* Name.Exception */
.nf { color: #0066bb; font-weight: bold } /* Name.Function */
.nl { color: #336699; font-style: italic } /* Name.Label */
.nn { color: #bb0066; font-weight: bold } /* Name.Namespace */
.py { color: #336699; font-weight: bold } /* Name.Property */
.nt { color: #bb0066; font-weight: bold } /* Name.Tag */
.nv { color: #336699 } /* Name.Variable */
.ow { color: #008800 } /* Operator.Word */
.w { color: #bbbbbb } /* Text.Whitespace */
.mf { color: #0000DD; font-weight: bold } /* Literal.Number.Float */
.mh { color: #0000DD; font-weight: bold } /* Literal.Number.Hex */
.mi { color: #0000DD; font-weight: bold } /* Literal.Number.Integer */
.mo { color: #0000DD; font-weight: bold } /* Literal.Number.Oct */
.sb { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Backtick */
.sc { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Char */
.sd { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Doc */
.s2 { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Double */
.se { color: #0044dd; background-color: #fff0f0 } /* Literal.String.Escape */
.sh { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Heredoc */
.si { color: #3333bb; background-color: #fff0f0 } /* Literal.String.Interpol */
.sx { color: #22bb22; background-color: #f0fff0 } /* Literal.String.Other */
.sr { color: #008800; background-color: #fff0ff } /* Literal.String.Regex */
.s1 { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Single */
.ss { color: #aa6600; background-color: #fff0f0 } /* Literal.String.Symbol */
.bp { color: #003388 } /* Name.Builtin.Pseudo */
.vc { color: #336699 } /* Name.Variable.Class */
.vg { color: #dd7700 } /* Name.Variable.Global */
.vi { color: #3333bb } /* Name.Variable.Instance */
.il { color: #0000DD; font-weight: bold } /* Literal.Number.Integer.Long */

        </style>
    </head>
    <body>
        <p>Here is a list of all the available stylesheets:</p>
        <ul>
            <li>autumn.css</li>
            <li>autumn.html</li>
            <li>autumn.tex</li>
            <li>borland.css</li>
            <li>borland.html</li>
            <li>borland.tex</li>
            <li>bw.css</li>
            <li>bw.html</li>
            <li>bw.tex</li>
            <li>colorful.css</li>
            <li>colorful.html</li>
            <li>colorful.tex</li>
            <li>default.css</li>
            <li>default.html</li>
            <li>default.tex</li>
            <li>emacs.css</li>
            <li>emacs.html</li>
            <li>emacs.tex</li>
            <li>friendly.css</li>
            <li>friendly.html</li>
            <li>friendly.tex</li>
            <li>fruity.css</li>
            <li>fruity.html</li>
            <li>fruity.tex</li>
            <li>manni.css</li>
            <li>manni.html</li>
            <li>manni.tex</li>
            <li>monokai.css</li>
            <li>monokai.html</li>
            <li>monokai.tex</li>
            <li>murphy.css</li>
            <li>murphy.html</li>
            <li>murphy.tex</li>
            <li>native.css</li>
            <li>native.html</li>
            <li>native.tex</li>
            <li>pastie.css</li>
            <li>pastie.html</li>
            <li>pastie.tex</li>
            <li>perldoc.css</li>
            <li>perldoc.html</li>
            <li>perldoc.tex</li>
            <li>rrt.css</li>
            <li>rrt.html</li>
            <li>rrt.tex</li>
            <li>tango.css</li>
            <li>tango.html</li>
            <li>tango.tex</li>
            <li>trac.css</li>
            <li>trac.html</li>
            <li>trac.tex</li>
            <li>vim.css</li>
            <li>vim.html</li>
            <li>vim.tex</li>
            <li>vs.css</li>
            <li>vs.html</li>
            <li>vs.tex</li>
            </ul>
    </body>
</html>
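How Dexy builds the pygments object is not shown on this page. As an illustration only, a dictionary with similar keys could be assembled as below. The function name, the key scheme and the restriction to .css and .tex entries are assumptions, and only three styles are generated to keep the output short.

```python
from pygments.formatters import HtmlFormatter, LatexFormatter


def build_style_lookup(style_names):
    """Map '<style>.css' and '<style>.tex' to generated style definitions."""
    lookup = {}
    for name in style_names:
        lookup[name + ".css"] = HtmlFormatter(style=name).get_style_defs()
        lookup[name + ".tex"] = LatexFormatter(style=name).get_style_defs()
    return lookup


# Hypothetical stand-in for the object exposed to templates.
pygments_styles = build_style_lookup(["default", "manni", "pastie"])

for key in sorted(pygments_styles):
    print(key)
```

A template would then look up an entry by key, in the same way the example above reads pygments['pastie.css'].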

Filter Source Code

class PygmentsFilter(DexyFilter):
    """
    Apply Pygments <http://pygments.org/> syntax highlighting.
    
    Image output formats require PIL to be installed.
    """
    aliases = ['pyg', 'pygments']
    IMAGE_OUTPUT_EXTENSIONS = ['.png', '.bmp', '.gif', '.jpg']
    MARKUP_OUTPUT_EXTENSIONS = [".html", ".tex", ".svg", ".txt"] # make sure .html is first so it is default output format
    LEXER_ERR_MSG = """Pygments doesn't know how to syntax highlight files like '%s' (for '%s').\
    You might need to specify the lexer manually."""

    _settings = {
            'examples' : ['pygments', 'pygments-image', 'pygments-stylesheets'],
            'input-extensions' : [".*"],
            'output-extensions' : MARKUP_OUTPUT_EXTENSIONS + IMAGE_OUTPUT_EXTENSIONS + ['.css', '.sty'],

            'lexer' : ("""The name of the pygments lexer to use (will normally
            be determined automatically, only use this if you need to override
            the default setting or your filename isn't mapped to the lexer you
            want to use.""", None),
            'allow-unknown-ext' : ("""Whether to allow unknown file extensions
                to be parsed with the TextLexer by default instead of raising
                an exception.""", True),
            'allow-unprintable-input' : ("""Whether to allow unprintable input
                to be replaced with dummy text instead of raising an exception.""",
                True),
            'unprintable-input-text' : ("""Dummy text to use instead of
                unprintable binary input.""", 'not printable'),
            'lexer-args' : (
                "Dictionary of custom arguments to be passed directly to the lexer.",
                {}
                ),
            'lexer-settings' : (
                "List of all settings which will be passed to the lexer constructor.",
                []
            ),
            'formatter-settings' : (
                """List of all settings which will be passed to the formatter
                constructor.""", ['style', 'full', 'linenos', 'noclasses']
            ),

            'style' : ( "Formatter style to output.", 'default'),
            'noclasses' : ( "If set to true, token <span> tags will not use CSS classes, but inline styles.", None),
            'full' : ("""Pygments formatter option: output a 'full' document
                including header/footer tags.""", None),
            'linenos' : ("""Whether to include line numbers. May be set to
                'table' or 'inline'.""", None),
            'line-numbers' : ("""Alternative name for 'linenos'.""", None),
            }

    lexer_cache = {}

    def data_class_alias(klass, file_ext):
        if file_ext in klass.MARKUP_OUTPUT_EXTENSIONS:
            return 'sectioned'
        else:
            return 'generic'

    def docmd_css(klass, style='default'):
        """
        Prints out CSS for the specified style.
        """
        print klass.generate_css(style)

    def docmd_sty(klass, style='default'):
        """
        Prints out .sty file (latex) for the specified style.
        """
        print klass.generate_sty(style)

    def generate_css(self, style='default'):
        formatter = HtmlFormatter(style=style)
        return formatter.get_style_defs()

    def generate_sty(self, style='default'):
        formatter = LatexFormatter(style=style)
        return formatter.get_style_defs()

    def calculate_canonical_name(self):
        ext = self.prev_ext
        if ext in [".css", ".sty"] and self.ext == ext:
            return self.doc.name
        elif self.alias == 'htmlsections':
            return self.doc.name
        else:
            return "%s%s" % (self.doc.name, self.ext)

    def constructor_args(self, constructor_type, custom_args=None):
        if custom_args:
            args = custom_args
        else:
            args = {}

        for argname in self.setting("%s-settings" % constructor_type):
            if self.setting(argname):
                args[argname] = self.setting(argname)
        return args

    def lexer_alias(self, ext):
        if self.setting('lexer'):
            self.log_debug("custom lexer %s specified" % self.setting('lexer'))
            return self.setting('lexer')

        is_json_file = ext in ('.json', '.dexy') or self.output_data.name.endswith(".dexy")

        if is_json_file and (pygments.__version__ < '1.5'):
            return "javascript"
        elif is_json_file:
            return "json"

        if ext == '.Makefile' or (ext == '' and 'Makefile' in self.input_data.name):
            return 'makefile'

        try:
            return file_ext_to_lexer_alias_cache[ext]
        except KeyError:
            pass

    def create_lexer_instance(self):
        ext = self.prev_ext
        lexer_alias = self.lexer_alias(ext)
        lexer_args = self.constructor_args('lexer')
        lexer_args.update(self.setting('lexer-args'))

        if not lexer_alias:
            msg = self.LEXER_ERR_MSG
            msgargs = (self.input_data.name, self.key)

            if self.setting('allow-unknown-ext'):
                self.log_warn(msg % msgargs)
                lexer_alias = 'text'
            else:
                raise dexy.exceptions.UserFeedback(msg % msgargs)

        if lexer_alias in pygments_lexer_cache and not lexer_args:
            return pygments_lexer_cache[lexer_alias]
        else:
            lexer = get_lexer_by_name(lexer_alias, **lexer_args)
            if not lexer_args:
                pygments_lexer_cache[lexer_alias] = lexer
            return lexer

    def create_formatter_instance(self):
        if self.setting('line-numbers') and not self.setting('linenos'):
            self.update_settings({'linenos' : self.setting('line-numbers')})

        formatter_args = self.constructor_args('formatter', {
            'lineanchors' : self.output_data.web_safe_document_key() })
        self.log_debug("creating pygments formatter with args %s" % (formatter_args))

        return get_formatter_for_filename(self.output_data.name, **formatter_args)

    def process(self):
        if self.ext in self.IMAGE_OUTPUT_EXTENSIONS:
            try:
                import PIL
                PIL # because pyflakes
            except ImportError:
                print "python imaging library is required by pygments to create image output"
                raise dexy.exceptions.InactivePlugin('pyg')

        ext = self.prev_ext
        if ext in [".css", ".sty"] and self.ext == ext:
            # Special case if we get a virtual empty file, generate style file

            self.log_debug("creating a style file in %s" % self.key)
            if ext == '.css':
                output = self.generate_css(self.setting('style'))
            elif ext == '.sty':
                output = self.generate_sty(self.setting('style'))
            else:
                msg = "pyg filter doesn't know how to generate a stylesheet for %s extension"
                msgargs = (ext,)
                raise dexy.exceptions.UserFeedback(msg % msgargs)

            self.output_data.set_data(output)
            self.update_all_args({'override-workspace-exclude-filters' : True })

        else:
            lexer = self.create_lexer_instance()

            if self.ext in self.IMAGE_OUTPUT_EXTENSIONS:
                # Place each section into an image.
                for k, v in self.input_data.iteritems():
                    formatter = self.create_formatter_instance()
                    output_for_section = highlight(unicode(v).decode("utf-8"), lexer, formatter)
                    new_doc_name = "%s--%s%s" % (self.doc.key.replace("|", "--"), k, self.ext)
                    self.add_doc(new_doc_name, output_for_section)

                # Place entire contents into main file.
                formatter = self.create_formatter_instance()
                self.update_all_args({'override-workspace-exclude-filters' : True })
                with open(self.output_filepath(), 'wb') as f:
                    f.write(highlight(unicode(self.input_data), lexer, formatter))

            else:
                formatter = self.create_formatter_instance()
                for section_name, section_input in self.input_data.iteritems():
                    try:
                        section_output = highlight(unicode(section_input).decode("utf-8"), lexer, formatter)
                    except UnicodeDecodeError:
                        if self.setting('allow-unprintable-input'):
                            section_input = self.setting('unprintable-input-text')
                            section_output = highlight(section_input, lexer, formatter)
                        else:
                            raise
                    self.output_data[section_name] = section_output
                self.output_data.save()
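The constructor_args method above decides which settings reach the lexer or formatter constructor. The following standalone sketch reproduces that selection logic with a plain dictionary standing in for Dexy's settings (an assumption, since the real filter reads them through self.setting()). Unlike the method above, it copies custom_args instead of modifying it in place.

```python
def constructor_args(settings, constructor_type, custom_args=None):
    """Collect the settings that should reach a lexer or formatter constructor."""
    args = dict(custom_args or {})
    for argname in settings["%s-settings" % constructor_type]:
        value = settings.get(argname)
        if value:  # unset or falsy settings are skipped, as in the filter
            args[argname] = value
    return args


# Hypothetical settings, using the names listed under formatter-settings.
settings = {
    "formatter-settings": ["style", "full", "linenos", "noclasses"],
    "style": "default",
    "full": None,
    "linenos": "table",
    "noclasses": False,
}

formatter_args = constructor_args(settings, "formatter", {"lineanchors": "hello-py"})
print(formatter_args)
```

Because falsy values are skipped, setting noclasses to False has the same effect as leaving it unset: the option is simply not passed to the formatter.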

Content © 2013 Dr. Ana Nelson | Site Design © Copyright 2011 Andre Gagnon | All Rights Reserved.