Extreme Markup Languages |
The legendary computer luminary Donald Knuth [Knuth WWW] once asked why nobody ever takes a computer program to bed to read. The answer was simple: most computer programs are unreadable. Comments are too few and too far between, and the simple process of reading becomes the painful process of reverse engineering. Knuth's solution was that programs should not be 90% code and 10% comments; they should be 90% descriptive text and 10% code. This was the start of Literate Programming, or “LitProg” [LP WWW].
In the early days of computer programming, documentation within the source code of a program was an unaffordable luxury. Some systems actively stripped comments from code in order to save storage. Although storage no longer tends to be a significant limit, programmers still spend little time on documenting their code. There are several reasons for this. One is that programmers tend to be judged by whether their code works (or appears to), not by how well it is commented. Further, if project managers need to cut something from a development schedule, documentation is an item that can be removed from the schedule without forcibly affecting the delivered functionality.
On a personal level, during the fleeting moments that a programmer writes a particular piece of code, that code can appear to be so clear and obvious that documentation seems all but unnecessary. It is only when the programmer returns to the code after a month or more, all moments of clarity lost in the past, that the impact becomes obvious. The situation is even worse when a different programmer has to work on the code. With no documentation to read, the code needs to be reverse engineered in order to understand its intent and logic, but this is difficult to do with sufficient accuracy. Too often, the code is rewritten in the belief that rewriting is easier than reverse engineering. This squanders any experience gained by the original programmer, while simultaneously introducing new bugs into the code base.
No documentation system can force an undisciplined or lazy programmer to document their code, but there are things that can be done to make the task less onerous. LitProg tools, which allow code fragments to be interspersed within the documentation, put the documentation for a piece of code right beside that code, in the same file. Compared to having code and documentation in separate files, this greatly increases the chance that as code is modified, the documentation is also modified to keep it up to date.
A “literate program” (or “literate document”) is a human readable document containing short sections of code (known variously as “macros”, “chunks”, or “fragments”), written and ordered so that it can be understood easily by people. By contrast, most computer programs are ordered purely for the benefit of program compilers. In a literate program, source code fragments (or any textual fragments) can appear in any suitable order. When the literate document is processed, the code fragments are assembled into the order required to produce the source files by “tangling” the document, to introduce Knuth's terminology. Literate documents are also “woven” to convert them into a final documentation format. Traditionally the documentation format was TeX or LaTeX, but these days it can also be (X)HTML ((Extensible) Hypertext Markup Language), XSLFO (Extensible Stylesheet Language Formatting Objects), or PDF (Portable Document Format, aka Acrobat).
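The tangling step described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the implementation of any particular tool: the `<<name>>` invocation notation, the `FRAGMENTS` table, and the `tangle` function are all assumptions made for this example.

```python
import re

# Hypothetical fragment store: macro name -> bodies, listed in document order.
FRAGMENTS = {
    "timeseries.dtd": ["<<entities>>\n<<elements>>\n"],
    "entities": ['<!ENTITY % Decimal "#PCDATA">\n'],
    # A macro defined in two places; the bodies are concatenated in order.
    "elements": ["<!ELEMENT open (%Decimal;)>\n", "<!ELEMENT close (%Decimal;)>\n"],
}

# An invocation is written <<name>>; a newline directly after it is absorbed.
INVOKE = re.compile(r"<<(.+?)>>\n?")


def tangle(name, fragments, stack=()):
    """Return the fully expanded text of the fragment called `name`."""
    if name in stack:
        raise ValueError("recursive invocation of " + name)
    if name not in fragments:
        raise KeyError("undefined fragment " + name)
    body = "".join(fragments[name])
    # Replace every invocation with the expansion of the fragment it names.
    return INVOKE.sub(
        lambda match: tangle(match.group(1), fragments, stack + (name,)), body
    )


if __name__ == "__main__":
    print(tangle("timeseries.dtd", FRAGMENTS), end="")
```

The fragments may be written in whatever order reads best; only the expansion decides the order in which they reach the generated file.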
What follows is a non-exhaustive list of LitProg tools. All of these tools predate XML. References to further information can be found in the bibliography.
WEB |
WEB was Knuth's original LitProg system for Pascal. WEB directly marks up many of the syntactic features of Pascal, so that in creating a valid WEB document, a programmer has pre-validated much of the syntax of the code fragments. Note that WEB was written before the WWW (World Wide Web) came to prominence. Knuth's choice of name relates to his ideas of tangling and weaving. [Knuth 92] |
CWEB |
Also produced by Knuth's group, CWEB supported C rather than Pascal. It has now been extended to handle C++ and Java as well. [CWEB WWW] |
FWEB |
FWEB is a multi-language LitProg tool which is similar in spirit to WEB & CWEB. It was the first LitProg tool to support Fortran. [FWEB WWW] |
noweb |
noweb is the most well-known of the language-insensitive LitProg tools. These tools do not provide any syntactic support for any computer languages, and treat all code fragments as nothing more than text fragments. Language-insensitive LitProg tools can be used for any (textual) programming language or control files, so their loss of syntactic support is compensated for by a gain in flexibility. [noweb WWW] |
FunnelWeb |
FunnelWeb is another language-insensitive tool. Its unique feature is that its macros can have parameters, providing some of the power of a language pre-processor. xmLP, the XML LitProg tool described in this paper, takes its inspiration most strongly from FunnelWeb. [FunnelWeb WWW] |
SWEB |
SWEB is C. Michael Sperberg-McQueen's SGML LitProg tool. It was the first LitProg tool whose document format could feasibly be parsed by something other than the tool itself. [SWEB WWW] |
Sun's Javadoc is a powerful tool for generating reference documentation from comments embedded in Java code, and has inspired similar tools for other programming languages. Javadoc is ideal for documenting the available methods & classes in a Java API (Application Programming Interface). However, Javadoc is not a LitProg tool.
The documentation that Javadoc produces extends down only to the method signatures. It does not provide any support for documenting the workings of individual methods, and it does not allow the order in which methods and classes are presented to be controlled to improve readability. These are not criticisms, just observations. There is no such thing as one-size-fits-all documentation. Indeed, there are at least three major classes of documentation: detailed documentation, reference documentation, and user documentation.
LitProg tools do a good job of generating detailed documentation. Javadoc does a good job of generating reference documentation. Neither provides sufficient support for generating good user documentation. So, not all documentation is the same, and no documentation tool is suitable for every type of documentation. This paper focuses on LitProg tools, and hence on the problem of creating detailed documentation of the workings of program code.
This paper was written as a literate program, using an extended version of the “Extreme Markup Languages 2002” DTD. The literate document was processed twice using an XML LitProg tool, “xmLP” [xmLP WWW], which is described in this paper. The literate document was first “tangled”, where the macros were expanded to produce the source files. It was then “woven”, where the macros were cross-referenced and this document was generated. Both processes need to resolve the macros in the document, but for different purposes.
No source code fragments were copied into the literate document, because the literate document is the original source material from which the source code files are produced.
The following scenario illustrates how literate programs can be a valuable tool for maintaining synchronized files. Note: if at any stage you want to jump ahead to read about the LitProg tool “xmLP”, you can go directly to Section 3. However, you are encouraged to read this section first to get a sense of what a LitProg tool needs to achieve.
This demonstrates the documentation created by xmLP. So this section shows the output format. The input format is discussed in Section 3.
Consider the problem of representing the way a particular stock market share price changes over time (a “time series”). Taking a simplified view, a single daily price summary, which is an “event” from the time series, can be written as
xmLP Macro “Time Series Event Instance” [#1] =
<event date="2002-02-20">
  <open>85.70</open>
  <high>92.10</high>
  <low>81.37</low>
  <close>86.05</close>
  <volume multiplier="1000">811786</volume>
</event>
This macro is invoked in file #2 (Figure 16)
This macro is invoked in file #4 (Figure 20)
Note: this shows how xmLP XML macro definitions are “woven” into the documentation. Note the automatically generated cross-references.
Here, “date” is the date of the event, “open” is the opening (starting) price for that day, “high” and “low” are the maximum and minimum prices for that day (respectively), and “close” is the closing (final) price for that day. The “volume” is the number of shares traded during that day, and is commonly given in terms of thousands of shares traded.
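As a small illustration of how the fields described above fit together, the following Python sketch reads one event and scales the volume by its multiplier. The function name `summarize` and the use of Python's standard `xml.etree.ElementTree` and `decimal` modules are choices made for this example; they are not part of xmLP or of the paper's source files.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

EVENT_XML = """<event date="2002-02-20">
  <open>85.70</open>
  <high>92.10</high>
  <low>81.37</low>
  <close>86.05</close>
  <volume multiplier="1000">811786</volume>
</event>"""


def summarize(event_xml):
    """Return (date, prices, shares traded) for a single event element."""
    event = ET.fromstring(event_xml)
    # Prices are kept as Decimal so that values such as 85.70 stay exact.
    prices = {
        tag: Decimal(event.findtext(tag))
        for tag in ("open", "high", "low", "close")
    }
    volume = event.find("volume")
    # The multiplier defaults to 1 when the attribute is absent.
    shares = int(volume.text) * int(volume.get("multiplier", "1"))
    return event.get("date"), prices, shares


if __name__ == "__main__":
    date, prices, shares = summarize(EVENT_XML)
    print(date, prices["high"] - prices["low"], shares)
```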
The purpose of this example is to produce both a DTD and a W3C XML Schema to describe this event structure, within the limits of what each of these schema technologies can do. A knowledge of DTD and W3C XML Schema constructs is assumed.
The “open”, “high”, “low”, and “close” elements each contain a decimal number. In the DTD, decimal numbers can only be represented as unconstrained text. However, a suitably named entity can be used to suggest to human readers that decimal values should be used.
xmLP Macro “DTD: decimal pseudo-definition” [#2] =
<!ENTITY % Decimal "#PCDATA">
This macro is invoked in macro #3 (Figure 3)
Note: this shows how xmLP text macro definitions are “woven” into the documentation.
From a machine perspective, this is nothing more than a syntactic nicety. However, it makes maintenance easier for humans (by making the intent clear), and that makes it worth doing.
xmLP Macro “DTD: financial elements” [#3] =
{DTD: decimal pseudo-definition[2], Figure 2}
<!ELEMENT open (%Decimal;)>
<!ELEMENT high (%Decimal;)>
<!ELEMENT low (%Decimal;)>
<!ELEMENT close (%Decimal;)>
This macro is also defined in macro #6 (Figure 6)
This macro is invoked in file #1 (Figure 14)
Note: this shows how invocations (expansions) of one macro inside another are indicated and cross-referenced in the documentation.
Note: this macro is defined in multiple sections that are concatenated in document order to produce the complete content of the macro.
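The cross-references shown in these notes (“invoked in” and “also defined in”) can be derived mechanically from the macro definitions. The following Python sketch shows one way this might be done; the `DEFINITIONS` table and the `cross_reference` function are hypothetical and do not reflect how xmLP itself builds its cross-reference information.

```python
from collections import defaultdict

# (definition number, macro name, names invoked by this definition)
DEFINITIONS = [
    (2, "DTD: decimal pseudo-definition", []),
    (3, "DTD: financial elements", ["DTD: decimal pseudo-definition"]),
    (5, "DTD: integer pseudo-definitions", []),
    (6, "DTD: financial elements", ["DTD: integer pseudo-definitions"]),
]


def cross_reference(definitions):
    """Return (also_defined_in, invoked_in), both keyed by definition number."""
    numbers_by_name = defaultdict(list)
    for number, name, _ in definitions:
        numbers_by_name[name].append(number)

    also_defined_in = {}
    invoked_in = defaultdict(list)
    for number, name, invokes in definitions:
        # Other definitions that share this macro's name.
        also_defined_in[number] = [n for n in numbers_by_name[name] if n != number]
        # Record this definition against every definition of each invoked name.
        for target in invokes:
            for target_number in numbers_by_name.get(target, []):
                invoked_in[target_number].append(number)
    return also_defined_in, dict(invoked_in)


if __name__ == "__main__":
    also, invoked = cross_reference(DEFINITIONS)
    print(also)
    print(invoked)
```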
The W3C XML Schema datatypes contain a suitable decimal type, “xsd:decimal”, so the Schema equivalent is straightforward.
xmLP Macro “W3C XML Schema: financial elements” [#4] =
<xsd:element name="open" type="xsd:decimal"/>
<xsd:element name="high" type="xsd:decimal"/>
<xsd:element name="low" type="xsd:decimal"/>
<xsd:element name="close" type="xsd:decimal"/>
This macro is also defined in macro #7 (Figure 7)
This macro is invoked in file #3 (Figure 18)
The “volume” element contains a non-negative integer value (number of shares traded). It also has a positive integer “multiplier” attribute, since the volume is typically given in units of thousands of shares. As before, in the DTD the values are simply unconstrained text.
xmLP Macro “DTD: integer pseudo-definitions” [#5] =
<!ENTITY % NonNegativeInteger "#PCDATA">
<!ENTITY % PositiveInteger "CDATA">
This macro is invoked in macro #6 (Figure 6)
The DTD nonetheless allows a default value of ‘1’ to be defined for the “multiplier” attribute, so that its use with the “volume” element is optional.
xmLP Macro “DTD: financial elements” [#6] =
{DTD: integer pseudo-definitions[5], Figure 5}
<!ELEMENT volume (%NonNegativeInteger;)>
<!ATTLIST volume
  multiplier %PositiveInteger; "1">
This macro is also defined in macro #3 (Figure 3)
This macro is invoked in file #1 (Figure 14)
The W3C XML Schema data types contain the necessary integer data types. Although the Schema version is longer, it defines the same structure for the “volume” element.
xmLP Macro “W3C XML Schema: financial elements” [#7] =
<xsd:element name="volume">
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="xsd:nonNegativeInteger">
        <xsd:attribute name="multiplier" default="1" type="xsd:positiveInteger"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>
This macro is also defined in macro #4 (Figure 4)
This macro is invoked in file #3 (Figure 18)
The “event” element should contain no more than one each of the elements “open”, “high”, “low”, “close”, and “volume”. The order is not important. An “event” does not need to contain all of these elements, as any of the values could be undefined or unavailable. So each of the financial elements occurs 0 or 1 times in an “event”, in any order.
It is possible, but tedious, to create an XML DTD rule that enumerates all of the possible content options for “event”. Instead, it is simpler to make the DTD stricter than the W3C XML Schema, and have it enforce an (unnecessary) order on the financial elements.
xmLP Macro “DTD: event” [#8] =
<!ELEMENT event (open?, high?, low?, close?, volume?)>
This macro is also defined in macro #10 (Figure 10)
This macro is invoked in file #1 (Figure 14)
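To give a sense of why enumerating every permitted content option in a DTD would be tedious, the following Python sketch lists every way of choosing some of the five financial elements, each at most once, in any order. The helper name `content_options` is an assumption made for this example.

```python
from itertools import permutations

FINANCIAL_ELEMENTS = ("open", "high", "low", "close", "volume")


def content_options(elements):
    """List every ordered selection in which each element occurs 0 or 1 times."""
    return [
        option
        for size in range(len(elements) + 1)
        for option in permutations(elements, size)
    ]


if __name__ == "__main__":
    options = content_options(FINANCIAL_ELEMENTS)
    # Each option would need its own alternative in the DTD content model.
    print(len(options))
```

For five elements this yields 326 distinct sequences, which is why fixing the order in the DTD is the more practical compromise.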
The “event” element is also required to have a “date” attribute to date the values that it contains.
xmLP Macro “DTD: date pseudo-definition” [#9] =
<!ENTITY % Date "CDATA">
This macro is invoked in macro #10 (Figure 10)
xmLP Macro “DTD: event” [#10] =
{DTD: date pseudo-definition[9], Figure 9}
<!ATTLIST event
  date %Date; #REQUIRED>
This macro is also defined in macro #8 (Figure 8)
This macro is invoked in file #1 (Figure 14)
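W3C XML Schema supports the “0 or 1 times each in any order” rule using “xsd:all”. Note that a child of “xsd:all” must occur exactly once unless it is declared with minOccurs="0", so the definition below, as written, requires all five financial elements; adding minOccurs="0" to each element reference would make them optional.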
xmLP Macro “W3C XML Schema: event” [#11] =
<xsd:element name="event">
  <xsd:complexType>
    <xsd:all>
      <xsd:element ref="open"/>
      <xsd:element ref="high"/>
      <xsd:element ref="low"/>
      <xsd:element ref="close"/>
      <xsd:element ref="volume"/>
    </xsd:all>
    <xsd:attribute name="date" use="required" type="xsd:date"/>
  </xsd:complexType>
</xsd:element>
This macro is invoked in file #3 (Figure 18)
To represent a time series, a number of events are contained within a “timeSeries” element. A time series can contain any number of events, even zero. The dates of the events within a time series must be unique, but neither of the schema technologies used here can enforce that condition.
xmLP Macro “DTD: timeSeries” [#12] =
<!ELEMENT timeSeries (event*)>
This macro is invoked in file #1 (Figure 14)
xmLP Macro “W3C XML Schema: timeSeries” [#13] =
<xsd:element name="timeSeries">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="event" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
This macro is invoked in file #3 (Figure 18)
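Since neither schema as defined here enforces unique event dates, that condition would have to be checked separately. The following Python sketch shows one possible check using the standard library; the function name `duplicate_dates` and the sample data are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SAMPLE = """<timeSeries>
  <event date="2002-02-20"><close>86.05</close></event>
  <event date="2002-02-21"><close>87.10</close></event>
  <event date="2002-02-20"><close>85.90</close></event>
</timeSeries>"""


def duplicate_dates(time_series_xml):
    """Return the sorted list of dates used by more than one event."""
    root = ET.fromstring(time_series_xml)
    counts = Counter(event.get("date") for event in root.iter("event"))
    return sorted(date for date, count in counts.items() if count > 1)


if __name__ == "__main__":
    print(duplicate_dates(SAMPLE))
```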
With all of its required sections now explained, the DTD is assembled from the component macros as follows.
xmLP File [#1]: src/timeseries.dtd =
<?xml version="1.0" encoding="utf-8"?>
{DTD: financial elements[3,6], Figures 3, 6}
{DTD: event[8], Figure 8}
{DTD: timeSeries[12], Figure 12}
Note: this shows how xmLP file macro definitions are “woven” into the documentation. These define the source files that are generated during “tangling”.
This produces the following source file:
<?xml version="1.0" encoding="utf-8"?>
<!ENTITY % Decimal "#PCDATA">
<!ELEMENT open (%Decimal;)>
<!ELEMENT high (%Decimal;)>
<!ELEMENT low (%Decimal;)>
<!ELEMENT close (%Decimal;)>
<!ENTITY % NonNegativeInteger "#PCDATA">
<!ENTITY % PositiveInteger "CDATA">
<!ELEMENT volume (%NonNegativeInteger;)>
<!ATTLIST volume
  multiplier %PositiveInteger; "1">
<!ELEMENT event (open?, high?, low?, close?, volume?)>
<!ENTITY % Date "CDATA">
<!ATTLIST event
  date %Date; #REQUIRED>
<!ELEMENT timeSeries (event*)>
The sample instance file using the DTD (and containing just a single event) requires an appropriate “DOCTYPE” declaration.
xmLP File [#2]: src/timeseries-dtd.xml =
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE timeSeries SYSTEM "timeseries.dtd">
<timeSeries>
  {Time Series Event Instance[1], Figure 1}
</timeSeries>
This produces the following source file:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE timeSeries SYSTEM "timeseries.dtd">
<timeSeries>
  <event date="2002-02-20">
    <open>85.70</open>
    <high>92.10</high>
    <low>81.37</low>
    <close>86.05</close>
    <volume multiplier="1000">811786</volume>
  </event>
</timeSeries>
Note: the tangled source file was inserted into this document automatically using an XSLT (Extensible Stylesheet Language Transformations) script.
The W3C XML Schema is assembled from the component macros as follows.
xmLP File [#3]: src/timeseries.xsd =
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  {W3C XML Schema: financial elements[4,7], Figures 4, 7}
  {W3C XML Schema: event[11], Figure 11}
  {W3C XML Schema: timeSeries[13], Figure 13}
</xsd:schema>
This produces the following source file:
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="open" type="xsd:decimal"/>
  <xsd:element name="high" type="xsd:decimal"/>
  <xsd:element name="low" type="xsd:decimal"/>
  <xsd:element name="close" type="xsd:decimal"/>
  <xsd:element name="volume">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:nonNegativeInteger">
          <xsd:attribute name="multiplier" default="1" type="xsd:positiveInteger"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="event">
    <xsd:complexType>
      <xsd:all>
        <xsd:element ref="open"/>
        <xsd:element ref="high"/>
        <xsd:element ref="low"/>
        <xsd:element ref="close"/>
        <xsd:element ref="volume"/>
      </xsd:all>
      <xsd:attribute name="date" use="required" type="xsd:date"/>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="timeSeries">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="event" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
The sample instance file using the W3C XML Schema (and containing just a single event) requires an appropriate “schemaLocation” declaration (in this case a “noNamespaceSchemaLocation” declaration). The “xmlns:xsi” declaration is suppressed for brevity, but generated in the actual source file.
xmLP File [#4]: src/timeseries-schema.xml =
<?xml version="1.0" encoding="utf-8"?>
<timeSeries xsi:noNamespaceSchemaLocation="timeseries.xsd">
  {Time Series Event Instance[1], Figure 1}
</timeSeries>
This produces the following source file:
<?xml version="1.0" encoding="utf-8"?>
<timeSeries xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="timeseries.xsd">
  <event date="2002-02-20">
    <open>85.70</open>
    <high>92.10</high>
    <low>81.37</low>
    <close>86.05</close>
    <volume multiplier="1000">811786</volume>
  </event>
</timeSeries>
What you have read in this section is a literate program which defines and describes the DTD and W3C XML Schema fragments required to handle a real-world problem. The code fragments in the macros appear within a human-readable context that quickly clarifies what those fragments do, why they are needed, and what their limitations are. Being able to view DTD fragments beside their equivalent Schema fragments makes it easy to compare the two approaches in detail.
Having established the nature of a literate program, the way in which the “xmLP” tool supports LitProg can be described. Traditional LitProg tools provide the following:
The advent of XML made it less attractive and less necessary to define and support custom markup languages for literate documents. To take advantage of XML, xmLP takes the following approach:
Primarily, xmLP provides the business logic to deal with code macros, both for tangling and weaving. End users can concentrate on writing stylesheets that define the look and layout of their documentation, without having to worry about the semantics of macros and macro invocation, and without having to worry about building cross-reference information for the macros. These things are handled by xmLP.
So, what does the xmLP markup look like? In the literate document from which this paper was woven, the macro corresponding to Figure 2 is actually written as
<lp:macro lp:usage="once" lp:final="true">
  <lp:name>DTD: decimal pseudo-definition</lp:name>
  <lp:text>
<!ENTITY % Decimal "#PCDATA">
  </lp:text>
</lp:macro>
xmLP defines the following elements and attributes for creating macros. As mentioned previously, xmLP elements and attribute definitions are added to a DTD/Schema/etc. to make them available directly in the authoring of a document. A fragment DTD containing the complete list of xmLP elements and attributes is included in the appendix in Section 5.
lp:macro |
element: Indicates the definition of an xmLP macro. |
lp:usage |
attribute: One of “never”, “once” (the default), or “multiple”. Used to indicate how often the macro is to be invoked (used). This proves to be a valuable quality-control (sanity checking) measure, and is taken from “FunnelWeb”. |
lp:final |
attribute: One of “true” (the default) or “false”. If true, only this macro definition can have the given “lp:name”. If false, all macro definitions with the same “lp:name” are concatenated in document order to fully define the macro. Once again, taken from “FunnelWeb”. |
lp:name |
element: The name of the macro being defined. In principle, the name may contain XML elements, so that MathML expressions and the like can be used in macro names. In practice, xmLP currently applies the XPath “normalize-space” function to the macro name to generate a simple text name that is then used to decide whether two macro definitions have the same macro name or not. This is not ideal, but sufficiently good for most purposes. |
lp:text |
element: Indicates a plain text component of the macro definition. |
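The kind of sanity check that “lp:usage” makes possible can be sketched as follows. This is an illustration only: the exact rules xmLP applies are not spelled out here, so the sketch assumes that “never” means zero invocations, “once” means exactly one, and “multiple” means one or more. The `normalize_space` helper approximates the XPath function of the same name (Python's `str.split` treats a slightly wider set of characters as whitespace), and `check_usage` is a hypothetical name.

```python
def normalize_space(text):
    """Trim the ends and collapse each run of whitespace to a single space."""
    return " ".join(text.split())


# Assumed meaning of each lp:usage value, expressed as a test on the count.
ALLOWED = {
    "never": lambda count: count == 0,
    "once": lambda count: count == 1,
    "multiple": lambda count: count >= 1,
}


def check_usage(declared, invoked_names):
    """Return (name, usage, count) for every macro whose usage rule is broken.

    declared maps macro names to their lp:usage value; invoked_names lists
    the name given in each invocation found in the document.
    """
    counts = {}
    for name in invoked_names:
        key = normalize_space(name)
        counts[key] = counts.get(key, 0) + 1
    problems = []
    for name, usage in declared.items():
        key = normalize_space(name)
        count = counts.get(key, 0)
        if not ALLOWED[usage](count):
            problems.append((key, usage, count))
    return problems


if __name__ == "__main__":
    declared = {
        "Time Series Event Instance": "multiple",
        "DTD:  decimal\n pseudo-definition": "once",
        "Unused helper": "once",
    }
    invoked = [
        "Time Series Event Instance",
        "Time Series Event Instance",
        "DTD: decimal pseudo-definition",
    ]
    print(check_usage(declared, invoked))
```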
Invoking (calling or expanding) a macro is done with another xmLP element, “lp:invoke”. This element stands in place of the macro being called, and is entirely replaced by it during tangling (assembly of the generated source code files). Taking as an example the macro defined in Figures 3 and 6, this macro is written over two concatenated definitions (note that “lp:final” is set to false), and uses “lp:invoke” to insert the contents of the macro defined in Figure 5:
<lp:macro lp:usage="once" lp:final="false">
  <lp:name>DTD: financial elements</lp:name>
  <lp:text>
<lp:invoke><lp:name>DTD: decimal pseudo-definition</lp:name></lp:invoke>
<!ELEMENT open (%Decimal;)>
<!ELEMENT high (%Decimal;)>
<!ELEMENT low (%Decimal;)>
<!ELEMENT close (%Decimal;)>
  </lp:text>
</lp:macro>
<lp:macro lp:usage="once" lp:final="false">
  <lp:name>DTD: financial elements</lp:name>
  <lp:text>
<lp:invoke><lp:name>DTD: integer pseudo-definitions</lp:name></lp:invoke>
<!ELEMENT volume (%NonNegativeInteger;)>
<!ATTLIST volume
  multiplier %PositiveInteger; "1">
  </lp:text>
</lp:macro>
lp:invoke |
element: Invokes an xmLP macro by name, replacing the “lp:invoke” element completely with the macro contents. |
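The replacement of “lp:invoke” by the macro contents can be illustrated with a short Python sketch. It is not the xmLP implementation (which is written in XSLT); it handles only “lp:text” content, performs no cycle detection, and the namespace URI bound to the “lp” prefix is a placeholder invented for this example.

```python
import xml.etree.ElementTree as ET

LP = "urn:example:xmlp"  # hypothetical namespace URI, not taken from xmLP

DOC = """<doc xmlns:lp="urn:example:xmlp">
<lp:macro lp:usage="once" lp:final="true">
<lp:name>DTD: decimal pseudo-definition</lp:name>
<lp:text>&lt;!ENTITY % Decimal "#PCDATA"&gt;
</lp:text>
</lp:macro>
<lp:macro lp:usage="once" lp:final="false">
<lp:name>DTD: financial elements</lp:name>
<lp:text><lp:invoke><lp:name>DTD: decimal pseudo-definition</lp:name></lp:invoke>&lt;!ELEMENT open (%Decimal;)&gt;
</lp:text>
</lp:macro>
</doc>"""


def q(tag):
    """Qualify a local name with the (assumed) lp namespace."""
    return "{" + LP + "}" + tag


def macro_name(element):
    """Whitespace-normalized text of the lp:name child of a macro or invoke."""
    return " ".join("".join(element.find(q("name")).itertext()).split())


def collect(root):
    """Map each macro name to its lp:text elements, in document order."""
    macros = {}
    for macro in root.iter(q("macro")):
        macros.setdefault(macro_name(macro), []).extend(macro.findall(q("text")))
    return macros


def expand(name, macros):
    """Concatenate a macro's text, replacing each lp:invoke by its expansion."""
    pieces = []
    for text in macros[name]:
        pieces.append(text.text or "")
        for child in text:
            if child.tag == q("invoke"):
                pieces.append(expand(macro_name(child), macros))
            pieces.append(child.tail or "")
    return "".join(pieces)


if __name__ == "__main__":
    print(expand("DTD: financial elements", collect(ET.fromstring(DOC))), end="")
```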
As well as plain text, xmLP macros can contain well-formed XML using “lp:xml”. As previously mentioned, well-formed XML fragments remove the risk of unmatched open or close tags. You can use plain text fragments (“lp:text”) to generate XML if you want, and it is sometimes useful to do so, but you take the risk of having unmatched tags in your generated XML source files. The following example corresponds to the output shown in Figure 1.
<lp:macro lp:usage="multiple" lp:final="true">
  <lp:name>Time Series Event Instance</lp:name>
  <lp:xml>
    <event date="2002-02-20">
      <open>85.70</open>
      <high>92.10</high>
      <low>81.37</low>
      <close>86.05</close>
      <volume multiplier="1000">811786</volume>
    </event>
  </lp:xml>
</lp:macro>
lp:xml |
element: Indicates a well-formed XML component of the macro definition. |
xmLP uses the “lp:file” element to distinguish top-level macros that define output source files (there can only be one such macro for each output source file). These file macros have a file name rather than a macro name. Note that “lp:file” macros cannot be invoked by other macros. Namespaces are supported by xmLP using “lp:namespace”, as in the following example which corresponds to the output shown in Figure 18.
<lp:file lp:filename="src/timeseries.xsd">
  <lp:namespace lp:value="http://www.w3.org/2001/XMLSchema" lp:prefix="xsd"/>
  <lp:text>
<?xml version="1.0" encoding="utf-8"?>
  </lp:text>
  <lp:xml>
    <xsd:schema>
      <lp:invoke><lp:name>W3C XML Schema: financial elements</lp:name></lp:invoke>
      <lp:invoke><lp:name>W3C XML Schema: event</lp:name></lp:invoke>
      <lp:invoke><lp:name>W3C XML Schema: timeSeries</lp:name></lp:invoke>
    </xsd:schema>
  </lp:xml>
</lp:file>
lp:file |
element: Indicates the definition of an xmLP file macro. |
lp:filename |
attribute: The file name (or path) of the file being defined. |
lp:namespace |
element: Indicates that a namespace declaration should be added to the tangled XML. |
lp:prefix |
attribute: The namespace prefix to use. |
lp:value |
attribute: The namespace identifier (typically a URI). |
To support W3C XML Schema, it is sometimes necessary to specify a Schema location using “lp:schemaLocation”, as in the following example which corresponds to the output shown in Figure 20.
<lp:file lp:filename="src/timeseries-schema.xml">
  <lp:schemaLocation lp:namespace="" lp:location="timeseries.xsd"/>
  <lp:text>
<?xml version="1.0" encoding="utf-8"?>
  </lp:text>
  <lp:xml>
    <timeSeries>
      <lp:invoke><lp:name>Time Series Event Instance</lp:name></lp:invoke>
    </timeSeries>
  </lp:xml>
</lp:file>
lp:schemaLocation |
element: Indicates that a (W3C XML) Schema location declaration should be added to the tangled XML. |
lp:namespace |
attribute: The namespace identifier (typically a URI). Can be empty. |
lp:location |
attribute: The Schema location URI. |
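One plausible reading of how “lp:namespace” and “lp:schemaLocation” translate into attributes on the root element of the tangled file is sketched below. The function `root_attributes` is hypothetical, and the mapping of an empty namespace to “xsi:noNamespaceSchemaLocation” is inferred from the tangled instance file shown earlier rather than from the xmLP source.

```python
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def root_attributes(namespaces, schema_locations):
    """Build the declarations to place on the root element of a tangled file.

    namespaces is a list of (prefix, uri) pairs; schema_locations is a list
    of (namespace, location) pairs, where the namespace may be empty.
    """
    attributes = {}
    for prefix, uri in namespaces:
        attributes["xmlns:" + prefix] = uri
    if schema_locations:
        # The xsi prefix must be declared before xsi:* attributes are used.
        attributes["xmlns:xsi"] = XSI
    pairs = []
    for namespace, location in schema_locations:
        if namespace == "":
            attributes["xsi:noNamespaceSchemaLocation"] = location
        else:
            pairs.extend([namespace, location])
    if pairs:
        # xsi:schemaLocation holds whitespace-separated namespace/location pairs.
        attributes["xsi:schemaLocation"] = " ".join(pairs)
    return attributes


if __name__ == "__main__":
    print(root_attributes([], [("", "timeseries.xsd")]))
    print(root_attributes([("xsd", "http://www.w3.org/2001/XMLSchema")], []))
```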
The current implementation of xmLP (version 1.1) is written as 600 lines of XSLT (plus stylesheets for particular formats like XHTML). This may change in future implementations. A potential improvement to xmLP would be to introduce parameterized macros (in the manner of “FunnelWeb”), but it is yet to be decided whether this is best done in XSLT or in a more general purpose programming language.
This paper has introduced literate programming, and indeed this paper is a literate program itself. It has demonstrated how programs (and other source files) can be defined within the natural flow of a human-readable document, rather than in the flow defined by a compiler. It has also introduced a simple LitProg tool, xmLP, which can be used to turn any XML document into a literate document.
The source files for this paper will be available at
http://xmLP.sourceforge.net/2002/extreme/
This DTD fragment is non-normative.
<?xml version='1.0' encoding='UTF-8' ?>
<!-- PUBLIC "+//IDN xmLP.org//DTD Sample Module for xmLP//EN" -->

<!-- The name of an "xmLP" macro. -->
<!ELEMENT lp:name ANY>

<!-- An invocation of an "xmLP" macro. -->
<!ELEMENT lp:invoke (lp:name)>

<!-- Text within an "xmLP" macro. -->
<!ELEMENT lp:text (#PCDATA | lp:invoke)*>

<!-- Balanced XML within an "xmLP" macro. -->
<!ELEMENT lp:xml ANY>

<!-- An "xmLP" macro. -->
<!ELEMENT lp:macro (lp:name , lp:namespace* , (lp:text | lp:xml)*)>
<!ATTLIST lp:macro
  lp:usage (never | once | multiple ) 'once'
  lp:final (true | false ) 'true' >

<!-- An "xmLP" namespace declaration. -->
<!ELEMENT lp:namespace EMPTY>
<!ATTLIST lp:namespace
  lp:prefix NMTOKEN #REQUIRED
  lp:value CDATA #REQUIRED >

<!-- An "xmLP" schemaLocation declaration. -->
<!ELEMENT lp:schemaLocation EMPTY>
<!ATTLIST lp:schemaLocation
  lp:namespace CDATA #REQUIRED
  lp:location CDATA #REQUIRED >

<!-- An "xmLP" output file. -->
<!ELEMENT lp:file ((lp:namespace | lp:schemaLocation)* , (lp:text | lp:xml)*)>
<!ATTLIST lp:file
  lp:filename CDATA #REQUIRED >

<!-- "xmLP" block elements. -->
<!ENTITY % lpBlock "lp:macro | lp:file">
[CWEB WWW] CWEB [LitProg tool], http://sunburn.stanford.edu/~knuth/cweb.html
[FunnelWeb WWW] FunnelWeb [LitProg tool], http://www.ross.net/funnelweb/
[FWEB WWW] FWEB [LitProg tool], http://w3.pppl.gov/~krommes/fweb_toc.html
[Knuth 92] “Literate Programming” by Donald Knuth, 1992, ISBN 0-937073-80-6, http://www-cs-faculty.stanford.edu/~knuth/lp.html
[Knuth WWW] Donald Knuth, http://www-cs-faculty.stanford.edu/~knuth/
[LP WWW] Literate Programming, http://www.literateprogramming.com/, http://www.loria.fr/services/tex/english/litte.html
[noweb WWW] noweb [LitProg tool], http://www.eecs.harvard.edu/~nr/noweb/#top
[SWEB WWW] SWEB [LitProg tool], http://tigger.uic.edu/~cmsmcq/tech/sweb/sweb.html
[xml-litprog-l] xml-litprog-l [mailing list], http://groups.yahoo.com/group/xml-litprog-l/
[xmLP WWW] xmLP [LitProg tool], http://xmLP.sourceforge.net/