University of Illinois DLib Test Suite Quarterly Report for October-December 1999

Collection Growth

More than 2200 full-content articles were processed, indexed, and added to the testbed, representing approximately 6% growth in testbed size during the 4th quarter. Because of changes to their DTD and document processing flow, no new articles have been received from the American Society of Civil Engineers (ASCE) since 1998. This quarter, however, we began to receive samples of their new document structure. Once we have updated our processing flow to accommodate the new structure, we anticipate adding the entire year’s backlog of ASCE materials to the testbed, which we hope to do in early 2000. This backlog should add considerably to the size of the testbed.

Mathematics Rendering

We have continued to make incremental improvements to the mathematics rendering for Microsoft Internet Explorer version 5 (IE5), using server-side and client-side scripting and cascading style sheets (CSS). We have also nearly completed the conversion of all the publishers’ marked-up mathematics to JPEG images for rendering in older, less capable browsers. This conversion should be completed in the 1st quarter of 2000, at which time we will move the image-based rendering into our production system and phase out support for native SGML rendering using the SoftQuad Panorama browser plugin.

We plan to continue improving the current math rendering. At the same time, we are still investigating MathML as the eventual best solution for rendering math, and in 2000 we plan to pursue conversion of the marked-up math in our current collection to MathML.

Metadata Enhancements

We have completed the redesign of our metadata schema, and we have begun testing new scripts to convert records from our old schema to the new schema. These scripts should be tested and working for our entire collection early in 2000, at which time we will switch entirely to the new schema for all of our document processing, searching, and extended citation display.

We also assisted in the development of an XML DTD to be used by all the DLib projects for documenting their various metadata formats according to the ISO 11179-3 standard. We have documented our metadata using this DTD, and we have also developed an XSL stylesheet for the display of these files.

XSLT

We are investigating the incorporation of the Extensible Stylesheet Language Transformations (XSLT) into more of our processing steps. Since XSLT is a W3C standard, this should increase the portability of our solutions.

We are currently utilizing XSLT for the rendering of extended citations from the metadata files for all of our publishing partners.
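
As a rough illustration of the kind of mapping involved in rendering a citation from a metadata file, the sketch below turns a small record into a one-line citation. It is written in Python with the standard library rather than in XSLT, and it is not our production stylesheet. The element names (`article`, `author`, `title`, `journal`, `volume`, `year`, `pages`), the sample record, and the citation layout are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record. The element names are assumptions for
# illustration only and do not reflect the actual testbed schema.
SAMPLE_RECORD = """
<article>
  <author>J. Smith</author>
  <author>A. Jones</author>
  <title>Load Testing of Steel Girders</title>
  <journal>Journal of Example Engineering</journal>
  <volume>12</volume>
  <year>1999</year>
  <pages>101-110</pages>
</article>
"""


def render_citation(xml_text):
    """Map one metadata record to a one-line citation string."""
    root = ET.fromstring(xml_text.strip())

    def field(name):
        # Return the text of the first matching child, or "" if it is absent.
        element = root.find(name)
        return element.text.strip() if element is not None and element.text else ""

    # Join all author elements in document order.
    authors = ", ".join((a.text or "").strip() for a in root.findall("author"))
    return "{}. {}. {} {} ({}): {}.".format(
        authors,
        field("title"),
        field("journal"),
        field("volume"),
        field("year"),
        field("pages"),
    )


if __name__ == "__main__":
    print(render_citation(SAMPLE_RECORD))
```

An XSLT stylesheet would express the same mapping declaratively as templates matched against the record's elements, which is what makes it portable across conforming processors.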

We will also be making extensive use of XSLT and the XML Document Object Model (DOM) during our new metadata generation process.
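
A minimal sketch of DOM-based record conversion is shown below, using Python's standard `xml.dom.minidom` module. It only illustrates the general approach of walking an old record and building a new one through DOM calls. The old tag names (`ti`, `au`), the new tag names (`title`, `creator`), the `metadata` root element, and the `TAG_MAP` table are hypothetical and are not taken from our old or new schema.

```python
from xml.dom import minidom

# Hypothetical old-schema record; tag names are assumptions for illustration.
OLD_RECORD = (
    "<record>"
    "<ti>Load Testing of Steel Girders</ti>"
    "<au>J. Smith</au>"
    "<au>A. Jones</au>"
    "</record>"
)

# Hypothetical mapping from old element names to new element names.
TAG_MAP = {"ti": "title", "au": "creator"}


def convert_record(old_xml):
    """Build a new-schema record from an old-schema record using the DOM."""
    old_doc = minidom.parseString(old_xml)
    new_doc = minidom.Document()
    new_root = new_doc.createElement("metadata")
    new_doc.appendChild(new_root)

    for child in old_doc.documentElement.childNodes:
        # Skip whitespace, comments, and anything that is not an element.
        if child.nodeType != child.ELEMENT_NODE:
            continue
        new_name = TAG_MAP.get(child.tagName)
        if new_name is None:
            # Elements without a mapping are dropped in this sketch.
            continue
        # Collect the element's direct text content.
        text = "".join(
            node.data for node in child.childNodes
            if node.nodeType == node.TEXT_NODE
        )
        new_element = new_doc.createElement(new_name)
        new_element.appendChild(new_doc.createTextNode(text))
        new_root.appendChild(new_element)

    return new_root.toxml()


if __name__ == "__main__":
    print(convert_record(OLD_RECORD))
```

A real conversion would also need to handle attributes, nested elements, and fields that split or merge between schemas. This sketch drops unmapped elements silently, which would likely be replaced by logging during testing.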

In addition, we are exploring the use of the DOM and XSLT for our dynamic processing that converts the raw article XML files into renderable form.

Presentations, Visitors, and Outside Researchers

Visitors

NTT Learning Systems Corporation from Japan

A group of five managers and staff from their Internet Department visited us for three days for presentations and discussions regarding potential participation in our partners program. NTT Learning Systems Corporation is developing an online technical journal system, similar to our testbed, for the government of Japan.

ASM International

We are in negotiations with ASM International, a professional society for materials engineers, to add them to our partnership program and possibly include some of their Metals Handbook material in the testbed.