XML Training: Concepts
Document Analysis for XML/SGML
(1-day lecture, with exercises)
Analysis is the key to success in developing XML/SGML applications, just as it is in relational databases, content management systems, and other structured information management environments. Analyzing text poses special problems, and the well-understood techniques for document analysis are often unfamiliar to newcomers to structured information.
This course begins with basic concepts of structured markup and a grounding in what to look for when analyzing documents for a DTD or schema. Instructors then cover approaches, simple techniques, and hands-on work with common design dilemmas, with examples of markup for search precision and content reuse. Key topics include:
- Principles of structured information/document analysis
- Who to involve in information/document analysis
- Information analysis process
- Information models and notations
- Designing to support search and retrieval
- Designing to support print and presentation
- Designing for affordable implementation
Central to the course is participants’ analysis of a complex document. Participants will use a simple methodology that allows them to analyze, discuss, and record complex relationships without DTD or schema syntax. The course provides enough detail on grouping, sequencing, and occurrence constructs for participants to understand what constraints can be expressed in XML or SGML. No prior knowledge of XML or SGML syntax is assumed, nor is it assumed that participants will ever need to read or write a DTD or schema.
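For readers curious about what grouping, sequencing, and occurrence constraints look like once an analysis is written down formally, the following is a minimal sketch in XML DTD notation. The element names (report, title, author, para, list, item, appendix) are invented for illustration and are not taken from the course materials; the course itself records these relationships without this syntax.

```dtd
<!-- Hypothetical content model, for illustration only -->
<!-- Sequencing: the comma requires title, then authors, then body, then appendix -->
<!-- Grouping:   parentheses with | allow either para or list at each position -->
<!-- Occurrence: + is one or more, * is zero or more, ? is optional -->
<!ELEMENT report   (title, author+, (para | list)*, appendix?)>
<!ELEMENT title    (#PCDATA)>
<!ELEMENT author   (#PCDATA)>
<!ELEMENT para     (#PCDATA)>
<!ELEMENT list     (item+)>
<!ELEMENT item     (#PCDATA)>
<!ELEMENT appendix (para+)>
```

Read in plain terms, this sketch says that a report has exactly one title, at least one author, any mix of paragraphs and lists, and at most one appendix. Statements of that kind are what document analysis aims to capture.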
Prerequisites: None.