Papers and Presentations
by Mulberry Staff

We have given numerous talks on XML, analysis, XSLT and XPath, and SGML at conferences, as well as to industry and user groups in the US, Canada, Asia, Australia, and Europe. We have also published articles in several industry publications. Some of Mulberry’s recent papers, public presentations, and publications (and a few old favorites) are described below.

And we “walk the walk”: our presentations are developed in XML, with “slide” renditions in PDF, HTML, or both. Selected descriptions below have links to the slides, sometimes in PDF, sometimes in HTML.

2016

Graceful tag set extension
B. Tommie Usdin, Deborah A. Lapeyre, Laura Randall, and Jeffrey Beck (Balisage 2016, North Bethesda MD)

It’s well understood that inventing a new tag set or XML vocabulary is time-consuming, complicated, and expensive. Not only must the semantics of the tag set be established, but an ecosystem of tools to support that vocabulary must be designed and built. Consequently, organizations often choose to adapt an existing vocabulary to suit their particular needs rather than starting from scratch. This decision is expected to have immediate benefits: a shorter development cycle, simpler customization, and reduced costs. Sometimes this is the case. Some changes lead to compatible documents that interoperate gracefully with existing tools. But this is not always the case. The authors explore when and how vocabulary changes will be compatible and when they will be disruptive. [Abstract only]

So You Want to Adopt JATS. What Decisions Do You Need To Make?
B. Tommie Usdin (JATS-Con, April 2016, Bethesda, MD)

Newcomers to JATS need to make decisions about which tag set to use (Authoring, Publishing, or Archiving), which table model to adopt, and how to handle math. In addition, they should consider citation model and style, contributor names and affiliations, alternative languages and encodings, and adoption of tagging guidelines from PMC, JATS4R, and/or their publishing partners. [Link to the paper]

Citing Data with JATS
Deborah A. Lapeyre (Force11 Data Citation Implementation Pilot (DCIP) Project Kick-Off Workshop, February 2016, Boston, MA)

Data needs to be cited as a primary resource. The ANSI/NISO Z39.96-2015 Journal Article Tag Suite (JATS) is a widely used XML tag set for marking up journal articles. After a brief introduction to JATS, the presentation focused on how to cite data within JATS-tagged journal articles. Noting Force11’s recommendation that data be cited as bibliographic references, the talk explained JATS’ “mixed” citation element and its constituent elements that can be used to cite data. New elements were added to JATS at the request of Force11 to make better data citations. JATS’ limitations vis-a-vis machine resolvability were also discussed. The talk concluded with examples of JATS data citations for the Dryad Digital Repository, a GenBank Protein, an RNA Sequence, Figshare data, etc. An appendix with a mapping of data fields to JATS elements is also included. [Link to the presentation slides (titled “Using JATS to Cite Data in Journal Articles”)]

2015

Manipulating XML Content: The Concepts and Practice of XSLT (post-conference tutorial)
Debbie Lapeyre (eXtyles User Group Meeting 2015, Cambridge, MA)

XSLT is a comparatively easy transformation tool that turns XML into “something else”, giving you the flexible single-source publishing you want. If you have XML, you need XSLT!

Using XSLT you can: convert XML files into display formats (such as HTML and eBook); make XML into tool-specific formats (such as typesetting languages); extract just a little of your XML for a catalog, report, or shopping cart; and in the process automatically add numbering, cross-reference links, tables of contents, and generated text. XSLT can convert documents tagged according to your XML into documents tagged according to someone else’s tag set.

XSLT is a programming language (sounds scary), but you can read it, write it, and understand it WITHOUT being a programmer. If you are a programmer, you need to know that XSLT is different but not difficult. If you’re a manager, you need to know what XSLT is good for. XSLT is easy to learn if you start with the data model and the processing model behind it. This introduction discusses the principles that underlie XSLT; introduces the data and processing models; demonstrates some XSLT transformations to highlight the sorts of things it is good for; and describes how XSLT is being used in a variety of environments.

Hands-on Introduction to XML
Debbie Lapeyre (XML Summer School, September 2015, St Edmund Hall, University of Oxford)

This 3-day course is designed to introduce students to XML basic principles, design, tagging, processing, Quality Assurance, and web delivery. This practical tutorial is taught as a series of hour-and-a-half segments: each segment begins with a one-hour lecture and ends with a half hour of hands-on exercises on the topic just discussed. Ms. Lapeyre wrote and teaches the lectures and exercises for the two XSLT segments and the Schematron segment, and assists with the XML schema segments (DTD, XSD, and RELAX NG).

The art of the elevator pitch
B. Tommie Usdin (Balisage 2015, North Bethesda MD)

Many of us at Balisage feel that the universe (or our organization, sponsor, client, or mother-in-law) doesn’t sufficiently appreciate or respect technologies we know could significantly improve the world. XSLT, techniques for processing overlap, DITA, XQuery, HTML5, even XML, are not given the attention they deserve. People aren’t listening! This is our fault, at least in part. We as a community need to learn to say less and communicate more, and more persuasively. [Abstract only]

Including Data in the Standardized Markup for Journal Articles (NISO JATS)
Deborah A. Lapeyre (Dataverse Workshop on Common Models and APIs for Data Publishing and Citation, June 2015, Cambridge, MA)

The ANSI/NISO Z39.96-2015 Journal Article Tag Suite (JATS) is a widely used XML tag set for marking up journal articles. New Force11 DataCite work adds new elements to JATS, making it easier than before to cite data sources. Several different types of data and datasets are tagged as examples. Since JATS is a descriptive rather than prescriptive tag set, there are always multiple ways to tag any one construct. An appendix illustrates potential mappings of data citation fields to JATS elements. [Link to the presentation slides (titled “Citing Data in Journal Articles using JATS”)]

Superimposing Business Rules on JATS
B. Tommie Usdin, Deborah A. Lapeyre, and Carter M. Glass (JATS-Con, April 2015, Bethesda, MD)

Publishers are stuck between a rock and a hard place. They want to use JATS for interchange but they want their model to help them maintain consistency and enforce their business rules, which JATS does not. We suggest a Schematron layer so they can have it both ways without having multiple models (a notion many publishers find confusing) or needing to transform their content on export (which many content creators find terrifying). [Link to the paper]

What’s New in JATS since 1.0? (tutorial)
B. Tommie Usdin and Deborah Lapeyre (JATS-Con, April 2015, Bethesda, MD)

There have been a number of updates made to the JATS article models since NISO Z39.96 was released officially in August of 2012. These updates comprise three Committee Draft releases: 1.1d1, December 2013; 1.1d2, December 2014; and 1.1d3, anticipated in Spring 2015.

We discuss in detail the changes specified in these three Committee Drafts (which we are expecting to be in the next official release). New capabilities include:

In addition to discussing the new capabilities we will provide examples of each in use and answer questions about how they could/should be used.

2014

Schematron for QA and Reporting (post-conference tutorial)
Debbie Lapeyre (eXtyles User Group Meeting 2014, Cambridge, MA)

Schematron is a language for Quality Assurance and for ad hoc reporting on XML documents and collections of XML documents. With Schematron, users can identify all documents or portions of documents that have, or don’t have, a particular structure, value, or pattern. For example, a user can find all of the documents with sections that lack titles (or have empty titles), list all of the documents that have more than 25 figures, or show all of the figures that were never referenced. Schematron can identify things that may be valid XML but that are often incorrect, or at least worth examination. Schematron can be used for reporting as well; for example, how many of the authors in a particular journal are members of the sponsoring society, or which articles in an encyclopedia have no citations less than five years old.

Even if you already use a DTD or XSD, Schematron can provide a fast, customizable means of validating your XML even further, with pinpoint reporting of values and value ranges, as well as the presence or absence of elements and attributes, while also checking co-constraints and other hard-to-crack edge cases.
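The kinds of checks described above can be illustrated outside Schematron itself. The following is a minimal sketch in Python, not Schematron and not taken from the tutorial: the sample document, the element names (borrowed from common JATS usage), and the function names are all assumptions made for illustration. It flags sections with missing or empty titles and figures that no cross-reference points to.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names follow common JATS usage.
SAMPLE = """<article>
  <body>
    <sec id="s1"><title>Introduction</title>
      <p>See <xref ref-type="fig" rid="f1"/>.</p>
      <fig id="f1"/>
      <fig id="f2"/>
    </sec>
    <sec id="s2"><title/><p>No heading text here.</p></sec>
    <sec id="s3"><p>No title element at all.</p></sec>
  </body>
</article>"""


def sections_without_titles(root):
    """Return ids of sec elements whose title child is missing or empty."""
    flagged = []
    for sec in root.iter("sec"):
        title = sec.find("title")  # direct child only
        if title is None or not "".join(title.itertext()).strip():
            flagged.append(sec.get("id"))
    return flagged


def unreferenced_figures(root):
    """Return ids of fig elements that no xref/@rid points to."""
    referenced = set()
    for xref in root.iter("xref"):
        # rid may hold several space-separated ids
        referenced.update(xref.get("rid", "").split())
    return [fig.get("id") for fig in root.iter("fig")
            if fig.get("id") not in referenced]


root = ET.fromstring(SAMPLE)
missing_titles = sections_without_titles(root)
orphan_figures = unreferenced_figures(root)
print("Sections lacking titles:", missing_titles)
print("Figures never referenced:", orphan_figures)
```

In Schematron the same tests would be written as XPath assertions with author-supplied messages; the sketch only shows the logic of the checks.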

Hands-on Introduction to XML
Debbie Lapeyre (XML Summer School, September 2014, St Edmund Hall, University of Oxford)

Designed to introduce the many and varied aspects of XML design, processing, and delivery, the three-day course offers the opportunity to gain an understanding of the latest XML tools and technologies in the marketplace. In hands-on exercises, students learn about content marked up in XML, validation using XML schemas, transformation with XSLT, output pagination with XSL-FO, and searching with XPath and XQuery. Additional topics include transferring structured data between applications, metadata and knowledge in XML (the Semantic Web), and service oriented architectures (web services).

When the “One Size Fits Most” tagset doesn’t fit you
B. Tommie Usdin (JATS-Con, April 2014, Bethesda, MD)

JATS does not actually claim to be a “one size fits all” specification. However, many information content consumers (libraries, archives, on-line services) accept only content that is valid to one of the JATS models, and in many cases specify a subset of the model defined in one of the JATS instantiations (Archiving, Publishing, or Authoring). Thus, content creators find that their vendors and tools often assume that they will be using one of the JATS models “out of the box”. This can present a real problem when a publisher has, and wants, information that is not modeled in JATS, or is not modeled in the JATS DTD their vendors and publishing partners require. In this case, the publisher has several options: drop the inconvenient information, use “Custom Metadata”, hide the inconvenient information in prose, abuse a tag, suggest a modification of the standard, or modify the tag set to encode the information that matters. None of these options is ideal, and which to choose depends in large part on circumstances. [Link to the paper]

Introduction to the Book Interchange Tag Suite (tutorial)
B. Tommie Usdin and Deborah Lapeyre (JATS-Con, March 2014, Bethesda, MD)

The Book Interchange Tag Suite (BITS) is an XML model for STM books that is based on the Journal Article Tag Suite (JATS; ANSI/NISO Z39.96-2012). The intent of BITS is to provide a common format in which publishers and archives can exchange book content, including book parts such as chapters. The Suite provides a set of XML schema modules that define elements and attributes for describing the textual and graphical content of books and book components as well as a package for book part interchange.

This half-day detailed “Introduction to the Book Interchange Tag Suite” will cover the major differences between JATS and BITS and discuss the book-specific features of BITS. Among the topics covered will be:

2013

JATS and BITS Update
Debbie Lapeyre (eXtyles User Group Meeting 2013, Cambridge, MA)

Exciting News! The Journal Article Tag Suite (JATS) has just become an ANSI/NISO standard (ANSI/NISO Z39.96-2012). This latest release of JATS permits users to choose MathML 2.0 or MathML 3.0 for their documents. Progress has also been made in internationalization for JATS, including such key features as Ruby tagging and the ability to describe the name of an author or an institution in more than one language or script. Users were invited to join the JATS effort and submit questions and requests through the NISO comment site. The Tag Library documentation has also been updated and the new look and feel was demonstrated.

In addition, there has been great progress in a JATS-based tag set for books, called Book Interchange Tag Suite (BITS). BITS is not yet at NISO; it is still a National Library of Medicine (NLM) initiative. BITS is an STEM XML tag set intended for use by publishers who are already tagging their journal articles with JATS. All the lower-level structures (paragraphs, figures, equations, tables, etc.) are the same in JATS and BITS, so a journal article could be easily transformed into the chapter of a book by adding a little book-specific metadata. BITS adds non-journal elements such as Indexes, Tables of Contents, questions and answers, and series metadata to JATS. BITS can be used for an entire book or to tag just a chapter. [Link to the presentation slides]

Hands-on Introduction to XML
Debbie Lapeyre (XML Summer School, September 2013, St Edmund Hall, University of Oxford)

Designed to introduce the many and varied aspects of XML design, processing, and delivery, the three-day course offers the opportunity to gain an understanding of the latest XML tools and technologies in the marketplace. In hands-on exercises, students learn about content marked up in XML, validation using XML schemas, transformation with XSLT, output pagination with XSL-FO, and searching with XPath and XQuery. Additional topics include transferring structured data between applications, metadata and knowledge in XML (the Semantic Web), and service oriented architectures (web services).

The semantics of “semantic”
B. Tommie Usdin (Balisage 2013, Montréal)

There was a time when I knew what the word “semantic” meant. That was a long time ago. Since then many people, on many occasions, in many contexts, have corrected my misunderstanding of the meaning of semantic. Perhaps it means nothing, or everything. Or perhaps I’m simply misinformed.

JATS: A New Standard from an Old Specification
Jeffrey Beck and B. Tommie Usdin (Information Standards Quarterly, Spring 2013, 25(1): 19-21)

The Journal Article Tag Suite (JATS) is a description of a set of elements and attributes that is used to build XML models of journal articles for archiving, publishing, and authoring. JATS became an American National Standard (ANSI/NISO Z39.96-2012) in August 2012, but it was already a well established specification (known by the colloquial name “NLM DTD”) by the time work began on standardization in late 2009. [Link to the paper]

2012

BITS — JATS for Books
B. Tommie Usdin and Deborah A. Lapeyre (Mulberry’s Seminar Series)

The National Library of Medicine has just announced the public availability of a new XML model: BITS (Book Interchange Tag Suite). BITS, now in draft for public comment, has been designed to meet the needs of publishers who are using JATS for journal articles and want to process their books in similar XML. JATS (the Journal Article Tag Suite) has been widely adopted for the XML encoding and interchange of journal articles, and has recently become an ANSI/NISO Standard (ANSI/NISO Z39.96-2012). The JATS models work well for much of the body of books, but there are significant differences between journal articles and books, which the new BITS model accommodates. This seminar will discuss the unique features of books that are modeled in BITS.

Introduction to Schematron (tutorial)
Deborah Lapeyre (JATS-Con, October 2012, Bethesda, MD)

This three-hour tutorial discusses Schematron, a rules-based validation/reporting language that works by making assertions about patterns found in XML documents and reporting back messages about the truth (or otherwise) of those assertions. Whether you are using XSD, DTD, or RELAX NG, there are some validations that those grammar-based schema languages just cannot express, or which, for practical or business reasons, you do not want to build into your basic XML models. Schematron can supplement your schema validation with targeted reporting on elements and attributes, testing their presence, absence, values or value ranges, checking co-constraints and other tricky situations, and warning about suspect occurrences that require further examination. To express its rules, Schematron relies on XPath, the tree-walking and expression language used with XQuery and XSLT.

Mapping JATS to RDF using the SPAR (Semantic Publishing and Referencing) Ontologies
Silvio Peroni, Deborah Lapeyre, and David Shotton (JATS-Con, October 2012, Bethesda, MD)

We will present a mapping of the metadata and bibliographic references from the Journal Article Tag Suite (JATS) to RDF, using the SPAR (Semantic Publishing and Referencing) ontologies together with elements from other well-known vocabularies. This mapping will permit XML documents marked up using JATS to be converted automatically to RDF, enabling the information contained within those documents to be published to the Semantic Web in a manner that is (hopefully) unambiguous and universally understood. By so doing, we hope to facilitate the publication of bibliographic information on the web as linked open data and to enhance the toolkit for libraries, archives, and publishers who have chosen to encode their journal material in NISO JATS. [Link to the paper]

Hands-on Introduction to XML
Debbie Lapeyre (XML Summer School, September 2012, St Edmund Hall, University of Oxford)

Designed to introduce the many and varied aspects of XML design, processing, and delivery, the three-day course offers the opportunity to gain an understanding of the latest XML tools and technologies in the marketplace. In hands-on exercises, students learn about content marked up in XML, validation using XML schemas, transformation with XSLT, output pagination with XSL-FO, and searching with XPath and XQuery. Additional topics include transferring structured data between applications, metadata and knowledge in XML (the Semantic Web), and service oriented architectures (web services).

Things change, or, the “real meaning” of technical terms
B. Tommie Usdin (Balisage 2012, Montréal)

Vocabulary is slippery, especially the sorts of technical jargon we are immersed in at events like Balisage. When we want to talk about a new idea, process, specification, or procedure we have two choices: make up a new word or use a word that is already in use to mean something else. New words may be difficult to remember and awkward to use. Re-purposing an existing word may cause confusion between the “old” and your “new” meaning. In either case, usage of terms changes. The usage of a technical term may mutate over time and may evolve differently in different communities. At times it is useful for a community to pressure users to use terms to mean what they meant when coined, but more often it is simple pedantry to insist that any usage other than that of the person who first introduced the term is incorrect. Our challenge is in finding that balance.

Luminescent: Parsing LMNL by XSLT upconversion
Wendell Piez (Balisage 2012, Montréal)

Among attempts to deal with the overlap problem, LMNL (Layered Markup and Annotation Language) has attracted its share of attention but has also never grown much past its origins as a thought experiment. LMNL's conceptual model differs from XML's, and by design its notation also differs from XML's. Nonetheless, a pipeline of XSLT transformations can parse LMNL input and construct an XML representation of LMNL, with the resulting benefit that further XML tools can be used to analyze and process documents originating from the alien notation. The key is to regard the task as an upconversion: structural induction performed over plain text.

Data Modeling in the Humanities: Three Questions and One Experiment
Wendell Piez (Workshop on Data Modeling and Knowledge Organization in the Humanities, March 2012, Brown University, Providence, RI)

Thinking about data modeling in the humanities leads directly to paradoxical questions regarding digital data, textual media, and their proper or possible relations in a system of representation. Rather than answer these questions directly, Wendell Piez poses three more. What do we mean by “data model”, and in particular, how can a data model be designed to support processes and methods that must be underspecified insofar as they are protean, contested, responsive to exigencies, and themselves objects of investigation? What about markup, and what is the relation of our data model to markup technologies? And what is the potential role of the schema, as an instrument of operations and transformations that can enable open-ended and experimental work? In order to help explore these issues, Wendell will demonstrate a prototype toolkit, parsing a markup syntax capable of representing arbitrary overlapping ranges and providing them with structured annotations. [Link to the presentation slides]

2011

Introduction to Multi-language Documents in NISO JATS
Deborah A. Lapeyre (JATS-Con, September 2011, Bethesda, MD)

The current JATS includes several structures that support encoding documents in which (some) metadata or text is provided in more than one language. In addition to the practically ubiquitous xml:lang attribute, there are elements specifically designed to contain multiple languages. Many metadata elements have been made repeatable and given an xml:lang language attribute, so that they can be present in the metadata once for each language. This introduction will explain the mechanisms for marking up multi-language content as well as examples of their use. You will learn how to encode an author's name in several languages (or language/script combinations) without creating the false impression that these variations represent additional authors. You will learn how to encode a table or a figure in several languages. Tagged metadata examples will illustrate the use of translated abstracts, titles and subtitles, keywords, and journal titles. Citation examples will illustrate multiple sources, reference authors in several scripts, and more. [Link to the paper]
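As a rough illustration of the repeatable, language-tagged metadata described above, the sketch below collects an article's titles keyed by their xml:lang value. It is written in Python rather than taken from the paper; the sample fragment, its title text, and the function name are invented for illustration, and the element names are assumed to match the JATS title-group model.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predeclared in XML; ElementTree exposes xml:lang
# under this expanded attribute name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment; titles are invented for illustration.
SAMPLE = """<title-group>
  <article-title xml:lang="en">Markup in practice</article-title>
  <trans-title-group xml:lang="fr">
    <trans-title>Le balisage en pratique</trans-title>
  </trans-title-group>
  <trans-title-group xml:lang="de">
    <trans-title>Auszeichnung in der Praxis</trans-title>
  </trans-title-group>
</title-group>"""


def titles_by_language(title_group):
    """Map each language code to the title given in that language."""
    titles = {}
    main = title_group.find("article-title")
    if main is not None:
        # "und" (undetermined) is used here when no xml:lang is present
        titles[main.get(XML_LANG, "und")] = "".join(main.itertext()).strip()
    for group in title_group.findall("trans-title-group"):
        trans = group.find("trans-title")
        if trans is not None:
            titles[group.get(XML_LANG, "und")] = "".join(trans.itertext()).strip()
    return titles


titles = titles_by_language(ET.fromstring(SAMPLE))
for lang, text in titles.items():
    print(lang, "->", text)
```

Because each translation sits in its own language-tagged wrapper, the code finds one title per language without mistaking the translations for additional titles.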

Taming the Beast: JATS data, non-JATS data, and XML Namespaces
Wendell Piez (JATS-Con, September 2011, Bethesda, MD)

An introduction to basic concepts, gotchas, and rules of thumb for working with namespaces in JATS documents and processing systems, addressing questions including the following: What are namespaces in XML, why do I need them, and why am I so confused? What can I do about this? Can I avoid namespaces altogether? (Yes, sometimes, but mostly no, not in the real world.) If I can't avoid them, how do I work with them, and what practices do I follow so as to understand what's going on in my data, recognize and fix problems when they arise, and prevent them from ever arising? What are the rules of good namespace hygiene? [Link to the paper]
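One common gotcha of this kind can be shown in a few lines. The sketch below is an illustration in Python, not an example from the paper: the sample document and variable names are assumptions. It shows that MathML elements embedded in a JATS-like article are not found by their local name alone, and that the prefix used in the file is irrelevant; only the namespace URI matters.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical article: the same namespace is bound to two different prefixes.
SAMPLE = """<article xmlns:mml="http://www.w3.org/1998/Math/MathML">
  <body>
    <p><inline-formula><mml:math><mml:mi>x</mml:mi></mml:math></inline-formula></p>
    <p><inline-formula>
      <m:math xmlns:m="http://www.w3.org/1998/Math/MathML"><m:mi>y</m:mi></m:math>
    </inline-formula></p>
  </body>
</article>"""

root = ET.fromstring(SAMPLE)

# Searching by local name alone misses namespaced elements entirely.
naive = list(root.iter("math"))

# Searching by the expanded name {namespace-URI}local-name finds both,
# regardless of which prefix the document happened to use.
qualified = list(root.iter("{%s}math" % MATHML_NS))

# Equivalent search with a caller-chosen prefix bound to the same URI.
mapped = root.findall(".//anyprefix:math", {"anyprefix": MATHML_NS})

identifiers = [m.find("{%s}mi" % MATHML_NS).text for m in qualified]

print(len(naive), len(qualified), len(mapped), identifiers)
```

The design point is to bind prefixes to namespace URIs in the processing code itself and never to rely on the prefixes that appear in the data.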

The future of the JATS: the probable, the possible, and the unlikely
B. Tommie Usdin (JATS-Con, September 2011, Bethesda, MD)

In the year since JATS-Con 2010 the JATS has made one major step (becoming a draft NISO Standard) and many small steps (changes to the Tag Suite itself). This is a good time to think about the future of the JATS. Will it, and should it, be basically stationary from now on? Will it be gracefully and gradually extended? What sorts of changes are probable? Are there revolutionary changes we should be thinking about? [Link to the paper]

Hands-on Introduction to XML
Debbie Lapeyre (XML Summer School, September 2011, St Edmund Hall, University of Oxford)

Designed to introduce the many and varied aspects of XML design, processing, and delivery, the three-day course offers the opportunity to gain an understanding of the latest XML tools and technologies in the marketplace. In hands-on exercises, students learn about content marked up in XML, validation using XML schemas, transformation with XSLT, output pagination with XSL-FO, and searching with XPath and XQuery. Additional topics include transferring structured data between applications, metadata and knowledge in XML (the Semantic Web), and service oriented architectures (web services).

Serendipity
B. Tommie Usdin (Balisage 2011, Montréal)

Conferences are ostensibly structured around presentations, papers, and posters; and these are key to the success of any conference. By common agreement, the informal aspects of conferences (lunch, coffee breaks, and overheard conversations) are of lesser importance. Balisage produces persistent electronic proceedings, but the conference itself is face to face only. It is human interaction and serendipity that provide the most value. You are likely to attend both presentations you know yourself to be interested in and talks on topics you know little about. You may expand your areas of interest; you may learn something useful. A talk about a topic totally foreign to you may prompt you to think of a solution to one of your current problems or a new approach to a long-standing problem. With luck, you will also make a few new friends, connect with some old ones, make a helpful suggestion to a fellow participant, and have some fun in Montréal.

Generic microformats for coverage, comprehensiveness, and adaptability
Wendell Piez (Balisage 2011, Montréal)

The major descriptive XML formats for publishing applications all have an Achilles' heel: their means of achieving breadth of coverage and adapting to local requirements. Many projects avoid schema extensibility mechanisms, which fork the local application from the core tag set, complicating implementation, maintenance, and document interchange and thus undermining many of the advantages of using a standard. Yet the easy alternative, creatively reusing and abusing available elements and attributes, is even worse: it introduces signal disguised as noise, degrades the semantics of repurposed elements and hides the interchange problem without solving it. This dilemma follows from the way we conceive of our models for text. If designing an encoding format for one work must compromise its fitness for any other, we will always be our own worst enemies. Reconsidering our approach to descriptive encoding, we can see a solution: supplement our current mechanisms with abstract generic elements designed specifically to support extensibility not in the schema but in the document instance, providing for bottom-up development, as microformats, of new semantic types.

2010

Fitting the Journal Publishing 3.0 Preview Stylesheets to Your Needs: Capabilities and Customizations
Wendell Piez (JATS-Con, November 2010, Bethesda, MD)

An introduction to the NCBI/NLM Journal Publishing 3.0 Preview XSLT stylesheets, which provide for basic styled display of Journal Publishing 3.0 data, in HTML and PDF, with an emphasis on features enabling extension and customization. With demonstrations. [Link to the paper]

Why Create a Subset of a Public Tag Set
Debbie Lapeyre (JATS-Con, November 2010, Bethesda, MD)

The Journal Article Tag Sets were designed as translation targets; they are permissive, descriptive rather than prescriptive, and use escape hatches to preserve as many semantics as possible in born-digital XML content that originates in another tag set. This means that the Tag Sets — which can describe “almost” anything for “almost” anybody — can be used right out of the box, and many users do just that. But for a publisher (particularly a publisher looking to move XML to earlier stages in a workflow) or for an archive with requirements to regularize archival content, the advantages to subsetting can be substantial. The benefits of subsetting the Tag Sets are discussed, e.g., the ability to leave documents valid to one of the original NLM Tag Sets while at the same time enabling business-specific reporting, Quality Assurance, and XML tool use. [Link to the paper]

XML Summer School (September 2010, St Edmund Hall, University of Oxford)
Debbie Lapeyre

Hands-on Introduction to XML: Through the Looking Glass (Lesson 2.2, an introduction to XSLT and XPath)

An introduction to transformation, the third key component of XML technology (the other two being the XML Language itself and XML Schemas). Following explanations of XSLT’s basic concepts and the XSLT processing model, attendees learn how to (1) use XSLT to transform source XML structures to produce XML and non-XML output and (2) use the XPath standard with XSLT to locate nodes in the XML source.

Hands-on Introduction to XML: Transformers, transform! (Lesson 2.3, transformations in action)

An overview of the more advanced features of XSLT (including some introduced in XSLT 2.0). Following a discussion of XPath location paths, attendees learn how to create XSLT stylesheets using “push” and “pull” approaches. Topics include those relevant to programmers, such as copying nodes from the tree; conditional processing; and variables, parameters, and named templates.

The high cost of risk aversion
B. Tommie Usdin (Balisage 2010, Montréal)

Avoiding risk is not always the way to minimize risk.

Creation of File Formats (in “Simplifying Digital Content: Standards from Creation to Distribution and Access”)
B. Tommie Usdin and Jeff Beck (NISO Update at 2010 ALA Annual Conference, Washington DC)

The goal of the Standardized Markup for Journal Articles Working Group (see www.niso.org/workrooms/journalmarkup/) is to take the currently existing National Library of Medicine (NLM) Journal Archiving and Interchange Tag Suite version 3.0, the three journal article schemas, and the documentation, and shepherd them through the NISO standardization process. In April, the group finished working on updates to version 3.0 and began authoring the standard itself. Participants learned more about the group’s work, as well as about discussions that had taken place in NISO about next steps, including potential work to develop a Book DTD. [Link to the presentation]

2009

The National Library of Medicine Tag Suite for Journal Articles: Taking Over the World of XML Journal Publishing
Debbie Lapeyre (XML-in-Practice 2009, Washington, D.C.)

The National Library of Medicine’s publicly available “NLM Journal Article Archiving and Interchange Tag Suite” has taken over the world of XML journal publishing. The journal tag sets made from the Suite are better than DocBook for extensive references; more targeted than DITA and TEI to journal articles; flexible enough for content beyond the STM world; and easily adaptable for eBooks. It’s free. It’s customizable. What do the National Library of Medicine, Library of Congress, the British National Library, JSTOR (Portico), and numerous journal publishers know that you should know? What is the Tag Suite and why is it the de facto journal article XML worldwide? [Downloadable .zip of PDF]

XML Summer School (September 2009, St Edmund Hall, University of Oxford)
Debbie Lapeyre

Hands-on Introduction to XML: Through the Looking Glass (Lesson 2.2, an introduction to XSLT and XPath)

An introduction to transformation, the third key component of XML technology (the other two being the XML Language itself and XML Schemas). Following explanations of XSLT’s basic concepts and the XSLT processing model, attendees learn how to (1) use XSLT to transform source XML structures to produce XML and non-XML output and (2) use the XPath standard with XSLT to locate nodes in the XML source.

Hands-on Introduction to XML: Transformers, transform! (Lesson 2.3, transformations in action)

An overview of the more advanced features of XSLT (including some introduced in XSLT 2.0). Following a discussion of XPath location paths, attendees learn how to create XSLT stylesheets using “push” and “pull” approaches. Topics include those relevant to programmers, such as copying nodes from the tree; conditional processing; and variables, parameters, and named templates.

Standards considered harmful
B. Tommie Usdin (Balisage 2009, Montréal)

Standards and shared specifications allow us to share data, build general purpose tools, and significantly reduce training and customization costs and startup time. That is, the use of appropriate specifications can help us reduce costs, reduce startup time, and increase quality, usability, and reusability of content. Some vigorous standards proponents insist that the more standards used the better. To them I say “mind your own business and let me mind my own store”. They argue that using standards is always the right thing to do, because it enables re-use and interchange. Maybe so. But adoption of a standard that supports an activity that is not central to your mission is a distraction, an unwarranted expense, a bad idea.

How to Play XML: Markup Technologies as Nomic Game
Wendell Piez (Balisage 2009, Montréal)

Projects involving markup technologies are game-like: they have players (teams and individuals), equipment, rules, victories, and defeats. In many of the markup games we play, the making of the game’s rules is part of the game itself. When the playing of a game involves the modification of the game’s own rules, it is said to be a “nomic game”. The process of legislation, for example — including the collaborative development of markup vocabularies and other markup standards — is a nomic game. This meditation considers how the experiences of earlier nomic games are influencing today’s contests, the far-reaching influence today’s nomic games will exert on those to be played later, and things to consider as we engage each other in the nomic games of markup theory and practice.

Summer XML 2009 Conference (July 2009, Raleigh, North Carolina)
Debbie Lapeyre

Introduction to Schematron

Schematron is a small, powerful, and lightweight fact-checker for XML documents. It offers the best error messages in the world — you write them yourself. Whether you are using XSD, DTD, or RELAX NG, there are some validations that those grammar-based schema languages just can’t express, or which, for practical or business reasons, you do not want to build into your basic XML models. Schematron offers a practical way to reach into these corners. Schematron can supplement your schema validation with targeted reporting on elements and attributes, testing their presence, absence, values or value ranges, checking co-constraints and other tricky situations, and warning about suspect occurrences that require further examination. To express its rules, Schematron relies on XPath, the industry-standard query syntax for data retrieval and linking within and among XML documents. This makes it a natural fit with other applications in the XML family of technologies, including XSLT and XQuery; eases development and maintenance; and rewards your organization’s investment in XML expertise with a higher quality product. This session is a presentation, discussion, and demonstration using real-world data, suitable for newcomers to XML-based document production as well as for editors, production staff, and technologists more experienced with XML.
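The idea of rule-based checking with author-written messages can be sketched in a few lines of Python. This is not Schematron itself and does not use any Schematron library; the rule table, the element names (fig, caption, price), and the check function are all hypothetical, chosen only to show a context, a test, and a custom message working together.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (context path, test on each matched element, message).
# Each message is written by the rule author, as in the approach described above.
RULES = [
    (".//fig", lambda e: e.find("caption") is not None,
     "Figure {id} has no caption."),
    (".//price", lambda e: float(e.text) > 0,
     "Price must be greater than zero."),
]

def check(root):
    """Return the message of every rule whose test fails on some element."""
    messages = []
    for context, test, message in RULES:
        for elem in root.findall(context):
            if not test(elem):
                messages.append(message.format(id=elem.get("id", "?")))
    return messages

# Invented sample: one good figure, one figure without a caption, a zero price.
DOC = ET.fromstring(
    "<doc><fig id='f1'><caption>OK</caption></fig>"
    "<fig id='f2'/><price>0</price></doc>"
)
report = check(DOC)
for line in report:
    print(line)
```

A real Schematron schema would express the context and test as XPath expressions in XML; the Python lambdas above are only a stand-in for those tests.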

Introduction to XSLT Concepts

You keep hearing that XML is exciting; that once you have your content in XML you can do anything with it; that XML is powerful and flexible. Then you look at an XML file and don’t see what the fuss is all about! XSLT (the XML transformation language) is the language that makes XML powerful and flexible. Using XSLT you can: convert XML into display formats (HTML, PDF, etc.); make XML into tool-specific formats (such as typesetting languages); and automatically add numbering, cross-references, tables of contents, and generated text to your pages. You can also use XSLT to convert documents tagged according to your DTD/schema into documents tagged according to someone else’s tag set! XSLT changes the way you’ll think about XML. This introduction covers the principles of XSLT, its processing model, what it can and can’t do, and how it is being used in real environments. In this brief hands-on tutorial you will run, and then modify, sample XSLT Transforms to illustrate the power of XSLT.

2008

LMNL in Miniature
Wendell Piez (Amsterdam Overlap Workshop, December 2008)

Introductory Schematron
Wendell A. Piez (XML-in-Practice 2008)

This tutorial discusses Schematron, a rules-based validation/reporting language that works by making assertions about patterns found in XML documents and reporting back messages about the truth (or otherwise) of those assertions. While Schematron can work with many tree-querying languages, the tutorial illustrates Schematron as it is most commonly used, with XPath, the tree-walking and expression language used with XQuery and XSLT.

An Introduction to Schematron
Wendell Piez and Debbie Lapeyre (Philadelphia XML Users Group)

A short version of Mulberry’s popular Introduction to Schematron, a small, powerful, easy to learn fact-checker for XML documents. Schematron can provide the best error/reporting messages in the world (you craft them for your specific situation) and can be really useful in editing and checking XML. Whether you are using XSD, DTD, or RELAX NG, there are some validations that those grammar-based schema languages just can’t express or that, for practical or business reasons, you do not want to build into your basic XML models. [Link to the presentation]

Cool or Useful
B. Tommie Usdin (Balisage 2008, Montréal)

True versus Useful, or True versus Likely-to-be-useful, are trade-offs we find ourselves making in document modeling and many other markup-related situations all the time. But Cool versus Useful is a far more difficult trade-off, especially since our world now includes a number of very cool techniques, tools, and specifications. Cool toys can have a lot of gravitational pull attracting attention, users, projects, and funding. Unfortunately, there is sometimes a disconnect between the appeal of a particular tool/technology and its applicability in a particular circumstance.

A Non-backwards-compatible Update: A Difficult Decision
Deborah A. Lapeyre (International Symposium on Versioning XML Documents and Vocabularies, Montréal)

The U.S. National Library of Medicine (NLM) Journal/Book Tag Sets have been widely adopted by libraries, archives, and commercial publishers. The users are widely distributed, generally unknown to each other, and in many cases unknown to the Tag Set advisory group, owners, and secretariat. The first five revisions to the Tag Sets were backwards compatible, but the most recent is not. The decision to make a non-backwards-compatible revision was not taken lightly. It was made based on several factors, including a decision to favor the needs of future users over the convenience of current users.

Introductory Schematron
Deborah A. Lapeyre and Wendell A. Piez (DC XML Users Group, January 2008, Washington, D.C.)

This tutorial discusses Schematron, a rules-based validation/reporting language that works by making assertions about patterns found in XML documents and reporting back messages about the truth (or otherwise) of those assertions. While Schematron can work with many tree-querying languages, the tutorial illustrates Schematron as it is most commonly used, with XPath, the tree-walking and expression language used with XQuery and XSLT. [Link to our Schematron page, which includes this tutorial’s slides]

2007

Interview with B. Tommie Usdin, President, Mulberry Technologies
(reprinted from Silverchair’s newsletter, Context Matters, December 2007)

Silverchair interviews Tommie Usdin about her experience with markup languages, the work of Mulberry Technologies, and the use of XML in publishing.

Separating Mapping from Coding in Transformation Tasks
Tommie Usdin and Wendell Piez (XML 2007, Boston)

Creating XML transformations in two separate tasks, Mapping and Coding, not only maximizes the skills of various team members, but also reduces development time and cost, and increases correctness of the finished code. [Link to the presentation slides or our example mapping specification]

Introductory Schematron
Deborah A. Lapeyre and Wendell A. Piez (XML 2007, Boston)

This tutorial discusses Schematron, a rules-based validation/reporting language that works by making assertions about patterns found in XML documents and reporting back messages about the truth (or otherwise) of those assertions. While Schematron can work with many tree-querying languages, the tutorial illustrates Schematron as it is most commonly used, with XPath, the tree-walking and expression language used with XQuery and XSLT. [Link to our Schematron page, which includes this tutorial’s slides]

TEI at 20: Congratulations! The Next 20 Will Tell the Tale
B. Tommie Usdin (Text Encoding Initiative Consortium Members’ Meeting, November 2007, College Park, MD)

At the Text Encoding Initiative Consortium Members’ Meeting (University of Maryland, College Park), B. Tommie Usdin delivers a Keynote presentation discussing the TEI’s accomplishments and influence on the computing world over the last 20 years and posing questions, the answers to which will define the TEI’s goals for the future. [Link to the text of the Keynote]

Riding the Wave, Riding for a Fall, or Just Along for the Ride?
B. Tommie Usdin (Extreme Markup Languages® 2007, Montréal)

Tommie Usdin discusses the implications of XML’s success: whether the work is over (or just starting), whether XML is (or should be) going underground, and whether the markup community has misconceptions about its role in XML’s success.

LMNL (Layered Markup and Annotation Language)
Wendell Piez (International Workshop on Markup of Overlapping Structures, August 2007, Montréal)

As part of a panel discussion, Wendell Piez explores the potential of LMNL as a way to handle overlapping markup.

Form and Format: Towards a Semiotics of Digital Text Encoding
Wendell Piez (Digital Humanities 2007, University of Illinois, Urbana-Champaign)

2006

XSLT for Quality Checking in the Publication Workflow
Wendell Piez (Mulberry’s Seminar Series)

[Link to the seminar slides. Those wishing to download the sample stylesheets demonstrated at the seminar may find the link in the seminar’s penultimate slide (#38).]

The Layered Markup and Annotation Language (LMNL)
John Cowan, Jeni Tennison, and Wendell Piez (Extreme Markup Languages® 2006, Montréal)

A brief report on some design decisions recently made by the Ad Hoc LMNL Group about the LMNL (Layered Markup and aNnotation Language) syntax and design model. A simplified version of layers is presented, along with a review of LMNL that includes previously unpublished material on non-character atoms and namespaces. (Although this paper is not represented in the conference proceedings, an author package is available as part of the proceedings.)

What Is XML and Why Should You Care?
B. Tommie Usdin and Debbie Lapeyre (XPlor Mid Atlantic, April 2006, Miami Beach)

More and more organizations are moving their content to XML. Some are asking for XML as well as pages from their printers; some are sending XML to their printers. This presentation discusses who is moving to XML and what they hope to get from it, as well as how XML works and how participants should approach it. The basic vocabulary needed to talk about XML and an overview of the logical components of an XML application are provided. [Link to the HTML presentation copy]

EXPLOR Global (February 2006, Miami Beach)
Tommie Usdin and Debbie Lapeyre

Introduction to XSLT Concepts

You keep hearing that XML is exciting; that once you have your content in XML you can do anything with it; that XML is powerful and flexible. Then you look at an XML file and don’t see what the fuss is all about! XSLT (the XML transformation language) is the language that makes XML powerful and flexible. Using XSLT you can: convert XML into display formats (HTML, PDF, etc.); make XML into tool-specific formats (such as typesetting languages); and automatically add numbering, cross-references, tables of contents, and generated text to your pages. You can also use XSLT to convert documents tagged according to your DTD/schema into documents tagged according to someone else’s tag set! XSLT changes the way you’ll think about XML. This introduction covers the principles of XSLT, its processing model, what it can and can’t do, and how it is being used in real environments. This is a concept course, showing “just enough” syntax. [Link to the HTML Slides for “Introduction to XSLT Concepts” or PDF of Handouts for “Introduction to XSLT Concepts”]

Introduction to XSL-FO Concepts (Printing Directly from XML)

XSL-FO (Extensible Stylesheet Language – Formatting Objects) is a specification for formatting XML documents for print or web display. Publishers, catalog producers, and financial institutions (among many others) are using XSL-FO to go directly from XML into PDF, PostScript, PCL, etc. This conceptual introduction covers what XSL-FO is, how it works, how it can be used, and what it is capable of producing (and what it can’t!). Using a stylesheet-in-development, we illustrate the logical components of an XSL-FO formatting system, explain how the page geometry works, and show you the basic vocabulary of “formatting objects”: blocks, wrappers, Cascading-Stylesheet-like attributes, and pages. Why isn’t everyone using XSL-FO? Should your company consider it? [Link to the HTML Slides for “Introduction to XSL-FO Concepts” or PDF of Handouts for “Introduction to XSL-FO Concepts”]

What Is XML and Why Should You Care?

XML is a data format that manages text and content as named objects. XML documents with their “tags” can be part of cost-effective solutions for content reuse, repurposing, internationalization, and more. This session provides the vocabulary you need to talk about XML, a look at how XML works, some real world examples, and a glimpse at the logical components of an XML application. [Link to the HTML Slides for “What Is XML and Why Should You Care?” or PDF of Handouts for “What Is XML and Why Should You Care?”]

How and Why Are Companies Using XML?

More and more organizations are moving their content to XML. Some are asking for XML as well as pages from their printers; some are sending XML to their printers. Who is moving to XML, and what do they hope to get from it? How can designers and printers serve their XML customers? If you understand why your customer wants XML and what they want to do with it, you can help them meet their goals, and thus increase your value as a supplier! [Link to the HTML Slides for “How and Why Are Companies Using XML?” or PDF of Handouts for “How and Why Are Companies Using XML?”]

Moving to XML: The Investment

XML has many benefits, but no one ever said it came for free. Moving to XML will change the way you work, the flow of content through your organization, staffing skills (and possibly staffing levels), and the opportunities you have. Where do tags come from and when? What does your staff need to know? Does added value mean added work? How can XML help in QA? What are some of the known problems and pitfalls you might avoid? [Link to the HTML Slides for “Moving to XML: The Investment” or PDF of Handouts for “Moving to XML: The Investment”]

Why XML for Print?

Should your organization be making print publications from XML? The current XML hype is focused on web portals, XML-service-oriented architectures, and e-business applications. But while the use of XML in traditional print publishing may be less trendy and newsworthy, it is equally powerful. Working with XML can help publishers improve quality and timeliness, as well as allowing them to repurpose, reuse, and reformat content from a single source. XML allows publishers to create high-quality print publications using source data that can also support electronic publication, electronic archives, enhanced search and retrieval, and new product opportunities. [Link to the HTML Slides for “Why XML for Print?” or PDF of Handouts for “Why XML for Print?”]

XML in Print Production

Most of the ways of adding XML to print production come down to a variation on one of three themes: making pages then XML, introducing XML during composition, and working with XML from as far in front of composition as you can manage. What are the implications and advantages of each style? Why would you prefer one to another? If you do make XML early in the production cycle, how do you get from XML to pages? There are many methods, each with its own set of pros and cons, that can be used in combination for multiple content reuse. [Link to the HTML Slides for “XML in Print Production” or PDF of Handouts for “XML in Print Production”]

Introduction to XPath 2.0
Wendell Piez and Debbie Lapeyre (DC XML Users Group, January 2006, Washington, D.C.)

An introduction to the concepts and syntax of XPath 2.0 (the new XML query and tree-traversal language — now in Working Draft — from W3C) and the differences between XPath 1.0 and XPath 2.0. The data model has changed; there are powerful new functions and operators; and XPath 2.0 is closer to a programming language than ever. [Link to the HTML presentation copy]

2005

Introduction to XPath 2.0
Wendell Piez (XML 2005, Atlanta)

A tutorial introducing the proposed new W3C XML document query and traversal language. With this well-received tutorial, we provide some sample files for participants to play with. [Downloadable samples (link to a 9Kb .zip file)]

XSLT Throughout the Document Lifecycle
Wendell Piez (XML 2005, Atlanta)

XSLT can be applied to a range of tasks besides generating final output formats, including the automation and semi-automation of editorial and copy-editing chores, extra-schema validation, data aggregation, filtering, indexing, file management, and more.

W3C XML Schema, RELAX NG, Schematron, or DTD: How’s a User To Choose?
B. Tommie Usdin (XML 2005, Atlanta)

XML DTDs and schemas are used to specify what tagging is allowed in a set of XML documents. Originally, XML had only one way to express these rules; now there are many, each of which reflects not only a different conception of the functional requirements for constraint languages, but also a different approach to meeting those requirements. This talk provides a clear look at the nature and strengths of each of the major schema languages (XML DTD, W3C XML Schema (XSD), RELAX NG, and Schematron), without hype and without advocating any of them. After discussing the uses of XML schemas in general, each language is examined, highlighting its major features; what sorts of constraints (rules) it can and cannot express; and the environments in which it is most popular. The talk ends with factors in selecting appropriate schema language(s) and a discussion of ways in which many organizations are using multiple schema languages in the same projects to do different tasks. [HTML Slides]

In Praise of the Edge Case
B. Tommie Usdin (Extreme Markup Languages® 2005, Montréal)

The Extreme Markup Languages® conference, the organizers are sometimes told, devotes too much time to edge cases. This complaint inspires reflection on the value of exploring, learning about, and learning from the technological edge. Remember: today’s mainstream application was yesterday’s edge case.

Format and Content: Can They Be Separated? Should They Be?
Wendell Piez (Extreme Markup Languages® 2005, Montréal)

An examination of a practical and theoretical question in markup language design, using as a counter-example the unorthodox “Web Graphic Layout Language” project of the author.

2004

Way Beyond Powerpoint
Wendell Piez (XML 2004, Washington, D.C.)

Microsoft PowerPoint is ubiquitous, and therefore controversial. Most critiques, both of the software and of its widespread adoption in educational settings, express concerns that are not particular to PowerPoint alone, but apply to “slideware” presentations generally. The reliance on sequences and hierarchies of bullet points (a poor means of presenting some kinds of complex information), the foregrounding of visual gimmicks over content, the displacement of attention from the speaker and her message onto summary arguments presented dumbly on screen: far from being necessary features of presentation technology, these (according to the critics) prove to be shortcomings that interfere with, rather than enhance, a presenter’s ability to communicate.

This paper presents an alternative to slideware, in the form of SVG graphics used for presentation. Why SVG? It meets all our functional requirements of a presentation technology, but even more importantly, as an XML-based format, Scalable Vector Graphics is well-suited to an XML-based production framework. Going far beyond sequences of bullet points, SVG supports open-ended, innovative uses of visual media in presentation. This becomes practical because the complexities of SVG coding can be relegated to a processing layer, following the classic design pattern of XML publishing. [This paper won the Best Speakers Award at XML 2004. An early version was presented at ALLC/ACH 2004 (Gothenburg, Sweden).]

Half-steps toward LMNL
Wendell Piez (Extreme Markup Languages®, Montréal)

Overlap in markup occurs where some markup structures do not nest, such as where the sentence and phrase boundaries of a poem and the metrical line structure describe different hierarchies. LMNL (Layered Markup and Annotation Language) is a model for representing textual data, designed to recognize and account for layer separation and markup overlap. LMNL is specified as a data model, not as a syntax — but without a syntax and an API, it’s very difficult to experiment with the model. The author demonstrates a subset of LMNL using an XML syntax and some severe restrictions on LMNL (thus “half-LMNL”).

Authoring Scholarly Articles: TEI or Not TEI?
Wendell Piez (ALLC/ACH 2004, Gothenburg, Sweden)

The TEI has grown and matured greatly in recent years, both in the number and breadth of its applications, and in their sophistication. It can be taken as a sign of the success and state of health of TEI to see persistent efforts to push its boundaries. The author discusses one area that is repeatedly cited as one where the TEI “should” provide a competitive alternative, but apparently does not: the realm of authoring or original composition by scholars and writers.

2003 – 2002 – 2001

NLM’s Public-domain DTDs: A 9-Month Update
Debbie Lapeyre and Jeff Beck (XML 2003, Philadelphia)

In March 2003, the National Library of Medicine (NLM) released into the public domain a suite of DTD modules for describing journal literature, books, and many kinds of textual material. The full suite was developed by the National Center for Biotechnology Information (NCBI) and the XML consulting firms Inera, Inc. (funded by the Andrew W. Mellon Foundation) and Mulberry Technologies, Inc. (funded by NCBI). Also in March the first two public DTDs developed from this suite were released: the Journal Archiving and Interchange DTD, and the Journal Publishing DTD which defines a common format for the creation of journal content in XML. This presentation discusses use of the DTDs so far, future plans, and the work of the advisory board.

XSL-FO Chefs’ Tools Exhibition
Tommie Usdin, maestro (XML 2003, Philadelphia)

In this technical exhibition of XSL-FO tools, each product representative provided a sample and rendered the samples provided by the other participants. As far as we know, this was the first public demonstration of interchange of typesetting files. The participants received only XSL-FO instances, without any guidance on what the formatted document should look like, and each formatted as many of the samples as they could. None of the tools succeeded with all of the samples, and some of them required manipulation of the documents before they could be rendered at all. At the end of a very exciting demonstration of XSL-FO rendering tools, the conclusions were: XSL-FO rendering is practical for many applications; there are a variety of high-quality XSL-FO tools available; each tool has strengths and weaknesses; and none is clearly superior to all others for all uses.

XSLT for Quality Checking in a Publication Workflow
Wendell Piez (XML 2003, Philadelphia)

Editorial work will always require the judgement of informed and sensitive human beings. Nonetheless, XML-based applications, even at a small scale, can support and complement, rather than detract from, the work of human beings in providing the kind of care and attention to information through the publishing process that is, ultimately, the only thing that can assure the quality of published works. This paper examines, in concrete detail (using the XML behind an XML 2003 conference paper as an example test bed), how one particular XML technology, XSLT, can be brought to bear in such applications.

When “It doesn’t matter” Means “It matters!”
B. Tommie Usdin (Extreme Markup Languages 2002, Montréal)

Few classes of narrative document can be as tightly specified as most business documents can. But many can usefully be specified more tightly than they are. This talk illustrates the costs of underspecifying content models. It is important to recognize the difference between “it doesn’t matter; there is no information here” and “it can’t be specified because the content creator will supply it”.

Human and Machine Sign Systems
Wendell Piez (Extreme Markup Languages 2002, Montréal)

A schema’s role is to mediate and adjudicate between human and machine semantics; recognizing this can help us manage our schemas better. Some practitioners work solely with an operational semantics, according to which the meaning of a tag is what we want it to cause the processing software to do with the data. A better understanding is reached if we adopt the structuralist view that a sign is the (arbitrary) relation between a signifier and a signified. In metalanguages (including schema languages) the signified is itself a sign; in some languages the signifier may likewise be a sign. Proper understanding of the relationship among sign, signifier, signified, metalanguage, and connotative system will allow us to layer our systems more effectively and to obtain useful results even in fluid systems where our understanding of the underlying reality cannot, or should not, be fixed.

The Layered Markup and Annotation Language (LMNL)
Jeni Tennison and Wendell Piez (Extreme Markup Languages 2002, Montréal)

Representing multiple hierarchies within a single document has always been a problem for XML. To try to address the problems of representing multiple hierarchies and of annotating existing tree structures with type information (as in the PSVI), we have developed a layered data model based on the Core Range Algebra presented at Extreme 2002 by Gavin Nicol. This data model views documents as strings over which span a number of named ranges, each of which can itself have associated metaranges with their own internal structure. To aid experimentation with this data model, we developed a markup notation to reflect it, the Layered Markup and Annotation Language (LMNL), and have constructed several prototype applications to facilitate the extraction of single views, as XML structures, from LMNL documents. (Although this paper is not represented in the conference proceedings, an author package is available as part of the proceedings.)
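The notion of named ranges spanning a string can be pictured with a minimal Python sketch. It is an illustration only, not the LMNL data model or syntax: the Range class, the span helper, the overlaps test, and the sample text are all assumptions made for this example, and metaranges are left out entirely.

```python
from dataclasses import dataclass

# Invented sample text: two sentences set as three "metrical" lines.
TEXT = "One two three. Four five six."

@dataclass(frozen=True)
class Range:
    name: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset

def span(name, fragment):
    """Build a named range covering the first occurrence of fragment in TEXT."""
    start = TEXT.index(fragment)
    return Range(name, start, start + len(fragment))

def overlaps(a, b):
    """True when a and b share characters but neither contains the other."""
    shares = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return shares and not (a_in_b or b_in_a)

line1 = span("line", "One two")
line2 = span("line", "three. Four")
sent1 = span("sentence", "One two three.")
sent2 = span("sentence", "Four five six.")

# line1 nests inside sent1, so a single tree could hold both;
# line2 straddles the sentence boundary, so it overlaps both sentences.
print(overlaps(line1, sent1), overlaps(line2, sent1), overlaps(line2, sent2))
```

Because the ranges are stored independently of one another, nothing forces them into a single hierarchy, which is the property a tree-shaped XML document cannot offer directly.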

XML and Print
Debbie Lapeyre (Seybold New York, 2002; other locations previous years)

This tutorial explores the relevance of XML as a data format for creating high-quality print publications that can later support electronic publication, electronic archives, and enhanced search and retrieval. XML’s ability to assist management of multi-author publications, revisions, and approvals; and its potential for fast reuse and repurposing of content are highlighted. [.zip of 2001 version, in PDF]

XML for Publishing Managers
Debbie Lapeyre (Seybold New York, 2002; other locations previous years)

A 3-hour tutorial that starts by defining XML and goes on to explain the benefits of XML applications, the use of XML in multimedia publishing, application integration, information repositories, and database publication. The impact of XML on workflow and staffing is also discussed, as well as the staff skills needed for XML-based data distribution. [.zip of August 2001 version, in PDF]

Document Analysis for DTD or Schema Development
Debbie Lapeyre and Tonya Gaylord (XML 2001, Orlando)

A tutorial on the principles of information analysis. An interactive sample document analysis is used to demonstrate basic concepts of structured markup, the distinction between “useful” and “possible” information, and the relationships between information components. [Handout in PDF]

From HTML to XML
Wendell Piez (XML 2001, Orlando)

Migrating data from a web format (HTML) into a more versatile and manageable XML format involves a range of decisions based on what shape the source code is in, what kinds of functions and operations the new XML-encoded data needs to be able to support, and design trade-offs between the power and versatility of markup on the one hand, and the expense of tagging and maintenance of strong data on the other. [HTML slides]

Beyond the “Descriptive vs. Procedural” Distinction
Wendell Piez (Extreme Markup Languages 2001, Montréal)

A paper considering markup design strategies from a theoretical point of view. Sometimes “semantic opacity” is a feature, not a bug. Because they sometimes work to mask even while they communicate, markup languages can be usefully considered as a species of rhetoric.

Previous years

A Manager’s Introduction to XML
Wendell Piez (XML 2000, Washington, D.C.)

A tutorial providing a non-technical introduction to XML, including its historical origins and its business application. Also discussed are the XML “family” of standards, i.e., those standards (XSL, XSLT, XSL-FO) related to XML. [.zip of PDF]

XSL: Characteristics, Status and Potentials for the Humanities
Wendell Piez (2000 Joint Conference of the Association for Computing in the Humanities and the Association for Literary and Linguistic Computing, Glasgow)

A conference paper providing an overview of XSL with reference to applications in Humanities disciplines, particularly as concerns digital text encoding projects (such as digital libraries) and Humanities-oriented analytical text processing. [Downloadable .zip of HTML]

Practical Guide to SGML/XML Filters Introduction
Debbie Lapeyre

Introduction for Norman E. Smith’s book on SGML/XML Filters (Plano, TX: Wordware Publishing, Inc., 1997 [1st edition], 1998 [2d edition]), noting the value of SGML when combined with translation programs for output across various media, e.g., print, voice synthesis. As a prelude to the book’s discussion of several languages for SGML manipulation, the importance of such filters in the authoring context to enable creation of SGML from diverse sources, such as desktop publishing tools or spreadsheets, is likewise highlighted.

XML for SGMLers
Tommie Usdin and Debbie Lapeyre (XML’98, Chicago)

A one-day tutorial on the details of XML syntax and the differences between XML and SGML, highlighting the features, functionality, and “funkiness” XML excludes. Hands-on instruction includes emphasis on the changes necessary to convert SGML documents into well-formed XML, the conversion of SGML DTDs, and the trade-offs in various conversion approaches. [Downloadable .zip of PDF]

XML: Not a Silver Bullet, but a Great Pipe Wrench
Tommie Usdin and Tony Graham (ACM StandardView 6(3):125-132, 1998)

An article discussing the potential uses and benefits of XML, while questioning whether the excitement surrounding it has been fully merited.

Washington Technologies White Papers

Several early statements (1997) on the business case for SGML/XML. [Downloadable .zip of HTML]