========================
 The Docutils Publisher
========================

:Author: David Goodger
:Contact: docutils-develop@lists.sourceforge.net
:Date: $Date: 2012-01-03 19:23:53 +0000 (Tue, 03 Jan 2012) $
:Revision: $Revision: 7302 $
:Copyright: This document has been placed in the public domain.

.. contents::


The ``docutils.core.Publisher`` class is the core of Docutils,
managing all the processing and relationships between components.  See
`PEP 258`_ for an overview of Docutils components.

The ``docutils.core.publish_*`` convenience functions are the normal
entry points for using Docutils as a library.

See `Inside A Docutils Command-Line Front-End Tool`_ for an overview
of a typical Docutils front-end tool, including how the Publisher
class is used.

.. _PEP 258: ../peps/pep-0258.html
.. _Inside A Docutils Command-Line Front-End Tool: ./cmdline-tool.html


Publisher Convenience Functions
===============================

Each of these functions sets up a ``docutils.core.Publisher`` object,
then calls its ``publish`` method.  ``docutils.core.Publisher.publish``
handles everything else.  There are several convenience functions in
the ``docutils.core`` module:

:_`publish_cmdline`: for command-line front-end tools, like
  ``rst2html.py``.  There are several examples in the ``tools/``
  directory.  A detailed analysis of one such tool is in `Inside A
  Docutils Command-Line Front-End Tool`_.

:_`publish_file`: for programmatic use with file-like I/O.  In
  addition to writing the encoded output to a file, also returns the
  encoded output as a string.

:_`publish_string`: for programmatic use with string I/O.  Returns
  the encoded output as a string.

:_`publish_parts`: for programmatic use with string input; returns a
  dictionary of document parts.  Dictionary keys are the names of
  parts, and values are Unicode strings; encoding is up to the client.
  Useful when only portions of the processed document are desired.
  See `publish_parts Details`_ below.

  There are usage examples in the `docutils/examples.py`_ module.

:_`publish_doctree`: for programmatic use with string input; returns a
  Docutils document tree data structure (doctree).  The doctree can be
  modified, pickled & unpickled, etc., and then reprocessed with
  `publish_from_doctree`_.

:_`publish_from_doctree`: for programmatic use to render from an
  existing document tree data structure (doctree); returns the encoded
  output as a string.

:_`publish_programmatically`: for custom programmatic use.  This
  function implements common code and is used by ``publish_file``,
  ``publish_string``, and ``publish_parts``.  It returns a 2-tuple:
  the encoded string output and the Publisher object.

.. _Inside A Docutils Command-Line Front-End Tool: ./cmdline-tool.html
.. _docutils/examples.py: ../../docutils/examples.py


Configuration
-------------

To pass application-specific setting defaults to the Publisher
convenience functions, use the ``settings_overrides`` parameter.  Pass
a dictionary of setting names & values, like this::

    overrides = {'input_encoding': 'ascii',
                 'output_encoding': 'latin-1'}
    output = publish_string(..., settings_overrides=overrides)

Settings from command-line options override configuration file
settings, and they override application defaults.  For details, see
`Docutils Runtime Settings`_.  See `Docutils Configuration Files`_ for
details about individual settings.

.. _Docutils Runtime Settings: ./runtime-settings.html
.. _Docutils Configuration Files: ../user/tools.html


Encodings
---------

The default output encoding of Docutils is UTF-8.  If you have any
non-ASCII in your input text, you may have to do a bit more setup.
Docutils may introduce some non-ASCII text if you use
`auto-symbol footnotes`_ or the `"contents" directive`_.

.. _auto-symbol footnotes:
   ../ref/rst/restructuredtext.html#auto-symbol-footnotes
.. _"contents" directive:
   ../ref/rst/directives.html#table-of-contents


``publish_parts`` Details
=========================

The ``docutils.core.publish_parts`` convenience function returns a
dictionary of document parts.  Dictionary keys are the names of parts,
and values are Unicode strings.

Each Writer component may publish a different set of document parts,
described below.  Not all writers implement all parts.


Parts Provided By All Writers
-----------------------------

_`encoding`
    The output encoding setting.

_`version`
    The version of Docutils used.

_`whole`
    ``parts['whole']`` contains the entire formatted document.


.. _HTML writer:

Parts Provided By the HTML Writer
---------------------------------

_`body`
    ``parts['body']`` is equivalent to parts['fragment_'].  It is
    *not* equivalent to parts['html_body_'].

_`body_prefix`
    ``parts['body_prefix']`` contains::

        </head>
        <body>
        <div class="document" ...>

    and, if applicable::

        <div class="header">
        ...
        </div>

_`body_pre_docinfo`
    ``parts['body_pre_docinfo']`` contains (as applicable)::

        <h1 class="title">...</h1>
        <h2 class="subtitle" id="...">...</h2>

_`body_suffix`
    ``parts['body_suffix']`` contains::

        </div>

    (the end-tag for ``<div class="document">``), the footer division
    if applicable::

        <div class="footer">
        ...
        </div>

    and::

        </body>
        </html>

_`docinfo`
    ``parts['docinfo']`` contains the document bibliographic data, the
    docinfo field list rendered as a table.

_`footer`
    ``parts['footer']`` contains the document footer content, meant to
    appear at the bottom of a web page, or repeated at the bottom of
    every printed page.

_`fragment`
    ``parts['fragment']`` contains the document body (*not* the HTML
    ``<body>``).  In other words, it contains the entire document,
    less the document title, subtitle, docinfo, header, and footer.

_`head`
    ``parts['head']`` contains ``<meta ... />`` tags and the document
    ``<title>...</title>``.

_`head_prefix`
    ``parts['head_prefix']`` contains the XML declaration, the DOCTYPE
    declaration, the ``<html ...>`` start tag and the ``<head>`` start
    tag.

_`header`
    ``parts['header']`` contains the document header content, meant to
    appear at the top of a web page, or repeated at the top of every
    printed page.

_`html_body`
    ``parts['html_body']`` contains the HTML ``<body>`` content, less
    the ``<body>`` and ``</body>`` tags themselves.

_`html_head`
    ``parts['html_head']`` contains the HTML ``<head>`` content, less
    the stylesheet link and the ``<head>`` and ``</head>`` tags
    themselves.  Since ``publish_parts`` returns Unicode strings and
    does not know about the output encoding, the "Content-Type" meta
    tag's "charset" value is left unresolved, as "%s"::

        <meta http-equiv="Content-Type" content="text/html; charset=%s" />

    The interpolation should be done by client code.

_`html_prolog`
    ``parts['html_prolog']`` contains the XML declaration and the
    doctype declaration.  The XML declaration's "encoding" attribute's
    value is left unresolved, as "%s"::

        <?xml version="1.0" encoding="%s" ?>

    The interpolation should be done by client code.

_`html_subtitle`
    ``parts['html_subtitle']`` contains the document subtitle,
    including the enclosing ``<h2 class="subtitle">`` & ``</h2>``
    tags.

_`html_title`
    ``parts['html_title']`` contains the document title, including the
    enclosing ``<h1 class="title">`` & ``</h1>`` tags.

_`meta`
    ``parts['meta']`` contains all ``<meta ... />`` tags.

_`stylesheet`
    ``parts['stylesheet']`` contains the embedded stylesheet or
    stylesheet link.

_`subtitle`
    ``parts['subtitle']`` contains the document subtitle text and any
    inline markup.  It does not include the enclosing
    ``<h2 class="subtitle">`` & ``</h2>`` tags.

_`title`
    ``parts['title']`` contains the document title text and any inline
    markup.  It does not include the enclosing ``<h1 class="title">``
    & ``</h1>`` tags.


Parts Provided by the PEP/HTML Writer
`````````````````````````````````````

The PEP/HTML writer provides the same parts as the `HTML writer`_,
plus the following:

_`pepnum`
    ``parts['pepnum']`` contains


Parts Provided by the S5/HTML Writer
````````````````````````````````````

The S5/HTML writer provides the same parts as the `HTML writer`_.


Parts Provided by the LaTeX2e Writer
------------------------------------

See the template files for examples of how these parts can be combined
into a valid LaTeX document.

abstract
    ``parts['abstract']`` contains the formatted content of the
    'abstract' docinfo field.

body
    ``parts['body']`` contains the document's content.  In other words,
    it contains the entire document, except the document title,
    subtitle, and docinfo.

    This part can be included into another LaTeX document body using
    the ``\input{}`` command.

body_pre_docinfo
    ``parts['body_pre_docinfo']`` contains the ``\maketitle`` command.

dedication
    ``parts['dedication']`` contains the formatted content of the
    'dedication' docinfo field.

docinfo
    ``parts['docinfo']`` contains the document bibliographic data, the
    docinfo field list rendered as a table.

    With ``--use-latex-docinfo``, the 'author', 'organization',
    'contact', 'address' and 'date' info is moved to titledata.

    'dedication' and 'abstract' are always moved to separate parts.

fallbacks
    ``parts['fallbacks']`` contains fallback definitions for
    Docutils-specific commands and environments.

head_prefix
    ``parts['head_prefix']`` contains the declaration of
    documentclass and document options.

latex_preamble
    ``parts['latex_preamble']`` contains the argument of the
    ``--latex-preamble`` option.

pdfsetup
    ``parts['pdfsetup']`` contains the PDF properties
    ("hyperref" package setup).

requirements
    ``parts['requirements']`` contains required packages and setup
    before the stylesheet inclusion.

stylesheet
    ``parts['stylesheet']`` contains the embedded stylesheet(s) or
    stylesheet loading command(s).

subtitle
    ``parts['subtitle']`` contains the document subtitle text and any
    inline markup.

title
    ``parts['title']`` contains the document title text and any inline
    markup.

titledata
    ``parts['titledata']`` contains the combined title data in
    ``\title``, ``\author``, and ``\date`` macros.

    With ``--use-latex-docinfo``, this includes the 'author',
    'organization', 'contact', 'address' and 'date' docinfo items.
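
Returning to the HTML writer: the "%s" placeholders left in the
``html_prolog`` and ``html_head`` parts have to be filled in by client
code.  The following is a minimal sketch of one way to do that, not a
Docutils API.  The ``resolve_encoding`` helper is hypothetical, and the
sample part values are shortened stand-ins written for illustration
rather than real ``publish_parts`` output.  It uses ``str.replace`` on
the first placeholder instead of the ``%`` operator, on the assumption
that a document title inside ``html_head`` may itself contain a percent
sign.

```python
def resolve_encoding(parts, encoding):
    """Return a copy of `parts` with the '%s' encoding placeholders filled in.

    Hypothetical helper, not part of Docutils.  Only the first '%s' in
    each part is replaced, so later percent signs (for example in a
    document title) are left untouched.
    """
    resolved = dict(parts)
    for name in ('html_prolog', 'html_head'):
        if name in resolved:
            resolved[name] = resolved[name].replace('%s', encoding, 1)
    return resolved


# Sample values standing in for the output of publish_parts (assumed
# shapes, shortened for illustration).
sample_parts = {
    'html_prolog': '<?xml version="1.0" encoding="%s" ?>\n',
    'html_head': ('<meta http-equiv="Content-Type" '
                  'content="text/html; charset=%s" />\n'
                  '<title>100% Docutils</title>\n'),
    'title': '100% Docutils',
}

resolved = resolve_encoding(sample_parts, 'utf-8')
print(resolved['html_prolog'], end='')
print(resolved['html_head'], end='')
```

The original dictionary is left unmodified, so the same parts can be
resolved again for a different output encoding.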