Many managers of XML content struggle with the degree and cost of structural and semantic tagging for their content. A traditional relational database, with its rows and columns, makes it difficult to take a library of content from one state of enrichment to the next: doing so often requires adding new columns and advancing all content to the new level, which can be cost-prohibitive.
One of the great features of MarkLogic, which we expose via our Drupal module, is the concept of targeted Content Enrichment. The figure below shows some of the stages of content enrichment an XML document can attain:

This enrichment staging concept allows for several major features:
· The belief that content is either unstructured or of archive-quality structure is dispelled
· Highly desired, more valuable content may warrant the investment in higher stages, while less-used content remains at a lower stage.
o This can be aligned with current human resources, budget and/or projected ROI for that content
· The system can operate with content at all stages, simply providing different views of different content based on its stage of enrichment
As a direct example of the second bullet, a math formula can be represented either as a picture or as MathML. The requirements for display, discovery and use determine whether the substantial additional effort of MathML is justified.
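The picture-versus-MathML trade-off can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how MarkLogic or the Drupal module works: the `formula` and `img` element names, the stage labels and both helper functions are hypothetical, and the code uses only Python's standard `xml.etree.ElementTree` parser.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical documents: the same formula at two stages of enrichment.
LOW_STAGE = '<formula><img src="quadratic.png" alt="x squared plus one"/></formula>'
HIGH_STAGE = (
    '<formula><math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mn>1</mn>"
    "</math></formula>"
)


def enrichment_stage(xml_text):
    """Return an assumed stage label based on the markup that is present."""
    root = ET.fromstring(xml_text)
    if root.find(f".//{{{MATHML_NS}}}math") is not None:
        return "mathml"
    if root.find(".//img") is not None:
        return "image"
    return "unstructured"


def searchable_tokens(xml_text):
    """Return whatever the current stage makes available for discovery."""
    root = ET.fromstring(xml_text)
    math = root.find(f".//{{{MATHML_NS}}}math")
    if math is not None:
        # MathML exposes each identifier, number and operator individually.
        return [el.text for el in math.iter() if el.text and el.text.strip()]
    img = root.find(".//img")
    if img is not None:
        # A picture exposes only its alternative text, if it has any.
        return img.get("alt", "").split()
    return []


for doc in (LOW_STAGE, HIGH_STAGE):
    print(enrichment_stage(doc), searchable_tokens(doc))
```

Both documents stay usable side by side. The higher stage simply gives display and search more to work with, which is the basis for deciding where the extra tagging effort pays off.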
VML’s use of both MarkLogic Server and Drupal supports this publisher objective.
- Craig Miller's blog
