IEEE TCDL Bulletin
 
Current 2005
Volume 2   Issue 1

 

Processing XML Documents with Overlapping Hierarchies

Ionut E. Iacob and Alex Dekhtyar
Department of Computer Science
University of Kentucky
Lexington, KY, USA
{eiaco0, dekhtyar}@cs.uky.edu

 
The problem of overlapping markup hierarchies, first identified in the context of SGML, arises frequently in XML text encoding applications in the humanities. Previous solutions rely on manual maintenance of the markup and address only the representation of overlapping features in XML, leaving automated maintenance and querying open. As a consequence, traditional XML tools are of little practical use when dealing with overlapping markup. This work attempts to bridge the gap between the apparent need for concurrent markup and the lack of software support for it. We formally define multiple markup hierarchies and design a framework for managing document-centric XML with overlapping markup. The framework generalizes the traditional XML processing framework and covers representing multiple markup hierarchies as well as parsing, querying, and authoring complex document-centric XML with overlapping markup hierarchies.

[Figure: thumbnail image of the poster]

We propose an underlying model, data structures, APIs, and algorithms so that most of the burden of managing concurrent XML hierarchies is carried by the software.

Among the advantages of the proposed overlapping markup formalism are its storage flexibility and querying capabilities:

  • Overlapping markup can be imported into and exported from our framework in a wide range of alternative representations.
  • The concurrent XML markup can be queried by leveraging existing XML query languages (XPath, XQuery).
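To make the notion of overlapping hierarchies concrete, the following is a minimal illustrative sketch and not the framework described here. It assumes one possible representation: each hierarchy is stored as its own well-formed XML document over the same text. The element names (page, line, phrase), the sample text, and the helper functions spans and properly_overlap are all hypothetical. The sketch maps every element to character offsets and reports the pairs of elements from different hierarchies that overlap without nesting, which is exactly the markup a single XML tree cannot hold.

```python
import xml.etree.ElementTree as ET

# Two hypothetical hierarchies over the same text (names are assumptions).
PHYSICAL = "<page><line>The quick brown </line><line>fox jumps</line></page>"
LINGUISTIC = ("<text><phrase>The quick </phrase>"
              "<phrase>brown fox</phrase> jumps</text>")


def spans(root):
    """Return (text, [(tag, start, end), ...]) using character offsets."""
    parts, found = [], []
    pos = 0

    def walk(elem):
        nonlocal pos
        start = pos
        if elem.text:
            parts.append(elem.text)
            pos += len(elem.text)
        for child in elem:
            walk(child)
            if child.tail:  # text following the child, inside elem
                parts.append(child.tail)
                pos += len(child.tail)
        found.append((elem.tag, start, pos))

    walk(root)
    return "".join(parts), found


def properly_overlap(a, b):
    """True if the two spans intersect and neither contains the other."""
    (_, s1, e1), (_, s2, e2) = a, b
    intersect = max(s1, s2) < min(e1, e2)
    nested = (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2)
    return intersect and not nested


phys_root = ET.fromstring(PHYSICAL)
ling_root = ET.fromstring(LINGUISTIC)
phys_text, phys_spans = spans(phys_root)
ling_text, ling_spans = spans(ling_root)

# Both hierarchies must mark up exactly the same character content.
assert phys_text == ling_text

# Elements from different hierarchies that cannot share one XML tree.
conflicts = [(a, b) for a in phys_spans for b in ling_spans
             if properly_overlap(a, b)]
for a, b in conflicts:
    print(a, "overlaps", b)

# Each hierarchy on its own is ordinary XML, so a standard path query works.
lines = phys_root.findall("line")
```

In this sample the second phrase starts on the first line and ends on the second, so it conflicts with both line elements. Keeping the hierarchies apart is only one of several possible representations; it is used here because it lets a standard parser and path expressions operate on each hierarchy unchanged.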
 

© Copyright 2005 Ionut E. Iacob and Alex Dekhtyar
Some or all of these materials were previously published in the Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, ACM 1-58113-876-8/05/0006.
