University of Illinois DLib Test Suite Quarterly Report for July-September 1999

Collection Growth

More than 2800 full-content articles were processed, indexed, and added to the testbed. This represents approximately 6% growth in testbed size during the 3rd quarter.

Mathematics Rendering

The mathematics for all publishers now renders reasonably well under MS IE 5, using a combination of real-time server-side scripting, Cascading Style Sheets (CSS), and client-side JavaScript.

One refinement developed in the 3rd quarter was to migrate more of the dynamic positioning calculations from the server to the client. This should allow finer control over math display while reducing the burden on the server, which improves the scalability of the solution. During the next quarter this solution will be fully implemented and validated using the testbed materials provided by ASCE. If it succeeds with the ASCE materials, it will be extended to the other publisher collections in the testbed during the 1st quarter of 2000.

During the 3rd quarter we developed an algorithm for capturing, as discrete bit-maps, the display of block equations as rendered by MS IE 5. These bit-maps can then be supplied to clients connecting to the testbed with MS IE 4 or Netscape 4 in place of the marked-up block equations, which these browsers cannot display well because of limitations in their earlier implementations of the Document Object Model. The algorithm will be implemented in quarter 4 so that testbed materials render at more nearly equal quality in the less capable browsers.
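
The browser-dependent fallback described above could be sketched roughly as follows. This is only an illustration, not the testbed's actual code: the function names and the "markup"/"bitmap" labels are hypothetical, and it assumes the browser can be identified from the "MSIE n" token in its User-Agent string.

```python
import re
from typing import Optional


def ie_major_version(user_agent: str) -> Optional[int]:
    """Return the MS IE major version found in a User-Agent string, or None.

    Assumption: IE identifies itself with a token such as "MSIE 5.0".
    """
    match = re.search(r"MSIE (\d+)", user_agent)
    return int(match.group(1)) if match else None


def equation_delivery(user_agent: str) -> str:
    """Choose how to deliver a block equation (hypothetical helper).

    Returns "markup" for MS IE 5 or later, which can render the marked-up
    equation, and "bitmap" for every other browser (e.g. MS IE 4, Netscape 4).
    """
    version = ie_major_version(user_agent)
    if version is not None and version >= 5:
        return "markup"
    return "bitmap"
```

Defaulting to the bit-map for any unrecognized browser is a conservative choice in this sketch: a client that cannot be identified still receives something it can display.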

We are continuing to investigate MathML as a possible future solution for rendering mathematics.

Metadata Enhancements

As part of the DOI-X effort (see below) and in anticipation of Test Suite interoperability requirements, a major redesign of our metadata tagging structure and nomenclature was begun in the 3rd quarter. The redesign is an attempt to conform more closely to the RDF syntax and the latest Dublin Core enhancements, and to improve the readability, internal consistency, extensibility, and portability of our metadata format. The semantic updates are mostly complete. Scripts and documentation will be updated in quarter 4, and existing metadata files will be rebuilt to conform to the new schema.
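
To give a sense of what metadata expressed in RDF syntax with Dublin Core elements looks like, here is a minimal sketch that builds one record. It does not reproduce the testbed's redesigned schema, which this report does not spell out; the function name, the record identifier, and the field values are illustrative assumptions. Only the two namespace URIs are the standard RDF and Dublin Core element-set namespaces.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and the Dublin Core element set.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_record(about: str, fields: dict) -> str:
    """Serialize one rdf:Description holding simple Dublin Core elements.

    `about` is the identifier of the described object; `fields` maps
    Dublin Core element names (e.g. "title") to text values.
    Hypothetical helper for illustration only.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for name, value in fields.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Example with made-up values:
record_xml = build_record(
    "urn:example:article-1",
    {"title": "An Example Article", "creator": "A. Author"},
)
```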

DOI-X

In conjunction with AIP, we participated in the DOI-X project to explore the viability of Digital Object Identifiers (DOIs) for linking between online articles published by major sci-tech publishers. We developed processes and programs to generate DOI metadata for our entire collection of AIP journal articles, which was then submitted for registration in the DOI-X repository. We contributed suggestions for improvements to the XML metadata DTD used to register DOIs and to the batch lookup system used to find DOI links, and we did preliminary proof-of-concept work on how to incorporate DOIs into the testbed metadata schema.

To support DOI linking into our testbed we also created and implemented an XSL stylesheet to transform and render object metadata. The XSL stylesheet, which is used in conjunction with a CSS stylesheet for final formatting, is applied client-side if the user accesses our testbed with MS IE 5 and server-side if the user accesses our testbed with Netscape 4 or MS IE 4. Based on the success of this approach for the DOI-X project, the same approach will now be implemented for the rest of the metadata in our collection. The DOI-X project work will conclude in quarter 4, at which time a decision will be made about continuing the project in a production mode.
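
The client-side versus server-side split for the XSL stylesheet might look something like the sketch below. It is an assumption-laden illustration rather than the testbed's implementation: the function name, the `metadata.xsl` file name, and the `server_transform` callable (standing in for whatever XSL processor runs on the server) are all hypothetical, and the metadata is assumed to be passed in without a leading XML declaration.

```python
from typing import Callable, Dict


def prepare_metadata_response(
    metadata_xml: str,
    client_supports_xsl: bool,
    server_transform: Callable[[str], str],
    stylesheet_href: str = "metadata.xsl",
) -> Dict[str, str]:
    """Decide where the XSL transformation happens (hypothetical helper).

    If the client can apply XSL itself, send the raw XML with an
    xml-stylesheet processing instruction pointing at the stylesheet.
    Otherwise, run the transformation on the server and send HTML.
    Assumes `metadata_xml` has no leading <?xml ...?> declaration.
    """
    if client_supports_xsl:
        instruction = f'<?xml-stylesheet type="text/xsl" href="{stylesheet_href}"?>\n'
        return {
            "transform": "client",
            "content_type": "text/xml",
            "body": instruction + metadata_xml,
        }
    return {
        "transform": "server",
        "content_type": "text/html",
        "body": server_transform(metadata_xml),
    }
```

Passing the server-side transformation in as a callable keeps the decision logic independent of any particular XSL processor, so the same stylesheet can serve both paths.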

Presentations, Visitors, and Outside Researchers

Partners Workshop

The annual partners’ workshop was held in Champaign on August 19-20. It was well attended by representatives from the American Institute of Physics (AIP), American Physical Society (APS), American Society of Civil Engineers (ASCE), ASM International, Association for Computing Machinery (ACM), Institution of Electrical Engineers (IEE), Naval Research Laboratory, Online Computer Library Center (OCLC), University of Chicago Press, Seagoat Consulting, and University of Illinois at Urbana-Champaign (UIUC).

We reported on the latest results of our research and development efforts, plus our ideas for other avenues of investigation. Everyone was very enthusiastic regarding our progress and the direction of our research. We also received invaluable feedback from our partners regarding directions they would like to see the research take.

Visitors

Engineers from AIP and APS

Several engineers and managers from both AIP and APS visited prior to the partners’ workshop for some in-depth technical discussions regarding our work.

Chicago State University

A group from the IT department of the CSU library visited to learn more about Grainger Library’s IT infrastructure and special projects, such as DLI.

Delegation from Singapore

We presented a brief overview and demonstration of the project to two visiting researchers from Singapore.

Outside Researchers

Temporary access to the testbed has been granted to a researcher at the Graduate School of Library and Information Science at Keio University, Japan, for his research on a theme called "Implications of the non-semantic attributes of documents for IR".