Proposed XML Schemas for Dublin Core Metadata

NOTE: These schemas have no endorsement beyond that of their authors. Developed around 2002-07 by Tom Habing and Tim Cole, they are early explorations and prototypes for the schemas developed for Notes on the W3C XML Schemas for Qualified Dublin Core, which should be considered the more definitive source.

Additional pages related to this work are: Examples, Revisions, and DTD Proposal.

Some points about these schemas, in no particular order:

Without further ado, the schemas and sample document instances:

dc1.1.xsd
This is the basic/simple root schema. It defines the 15 core elements, allowing only text values with an optional xml:lang attribute. However, the complexType used for the elements (SimpleLiteral) is defined in such a way as to be maximally extensible. This schema also includes a convenience type which can be used to define a container element for the 15 basic elements. For example, an importing schema could use this type in a single statement to define a container element in which all the 1.1 elements are allowed.
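
As a rough illustration of the idea, the sketch below shows one way a text-only but extensible SimpleLiteral type and a container convenience type could be declared. It is not copied from dc1.1.xsd: the type name elementContainer is an assumption, and only two of the 15 elements are shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://purl.org/dc/elements/1.1/"
           targetNamespace="http://purl.org/dc/elements/1.1/"
           elementFormDefault="qualified">

  <!-- Needed so that xml:lang can be referenced below. -->
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>

  <!-- Text-only content plus an optional xml:lang attribute. Declaring it
       as a mixed complex type with no child elements (rather than as a
       simple type) leaves room for derived types to add child elements
       or to restrict the text to a particular data type. -->
  <xs:complexType name="SimpleLiteral" mixed="true">
    <xs:sequence/>
    <xs:attribute ref="xml:lang" use="optional"/>
  </xs:complexType>

  <xs:element name="title"   type="SimpleLiteral"/>
  <xs:element name="creator" type="SimpleLiteral"/>
  <!-- The remaining core elements would be declared the same way. -->

  <!-- Assumed name for the convenience type: any mix of the elements,
       in any order, any number of times. -->
  <xs:complexType name="elementContainer">
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="title"/>
      <xs:element ref="creator"/>
    </xs:choice>
  </xs:complexType>
</xs:schema>
```

An importing schema could then declare its container with a single element declaration whose type attribute points at the convenience type.
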
simple_dc_container.xsd
This schema imports the dc1.1.xsd and defines a root element which may contain any of the simple dc elements. The OAI namespace is used only for illustration. A sample document instance which uses this schema is also included.
 
dc1.1complex.xsd
This schema includes the dc1.1.xsd, but it adds several new complexTypes derived from the dc1.1.xsd SimpleLiteral type. These types must be indicated in an instance document by using the xsi:type attribute. Several of them roughly correspond to the RDF parseTypes of 'Literal' and 'Resource'. Another type uses XLink attributes to correspond to an RDF property with an rdf:resource attribute. It can be used to link a DC element to a resource defined elsewhere, either internally or externally. The xlink.xsd schema was borrowed from the METS project.
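
To make the role of xsi:type concrete, here is a hypothetical instance document. The container element, its namespace, the child markup inside dc:creator, and the type name ResourceLink are all assumptions made for illustration; only ComplexResource is a name taken from the schema descriptions on this page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical container element and namespace, for illustration only. -->
<record xmlns="http://example.org/container/"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <!-- No xsi:type: the default SimpleLiteral applies (text only). -->
  <dc:title xml:lang="en">Annual Report</dc:title>

  <!-- xsi:type selects a derived type. ComplexResource is assumed here
       to allow structured child content; the child markup is made up. -->
  <dc:creator xsi:type="dc:ComplexResource">
    <person xmlns="http://example.org/people/">
      <name>A. Author</name>
    </person>
  </dc:creator>

  <!-- ResourceLink is an assumed name for the XLink-based type, which
       points at a resource instead of describing it inline. -->
  <dc:relation xsi:type="dc:ResourceLink"
               xlink:type="simple"
               xlink:href="http://example.org/reports/2001"/>
</record>
```
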
complex_dc_container.xsd
This schema imports the dc1.1complex.xsd and defines a root element which may contain any of the simple dc elements. The OAI namespace is used only for illustration. A sample document instance which uses this schema is also included.
 
dc1.1complex_vcard.xsd
This schema includes the dc1.1complex.xsd, but it defines a vCard complexType based on the ComplexResource type. If this type is specified via xsi:type='dc:vCard' in an instance document, the DC element must contain only valid vCard subelements. The vCard XML schema we developed for this purpose is also available.
vcard_dc_container.xsd
This schema imports the dc1.1complex_vcard.xsd and defines a root element which may contain any of the simple dc elements. The OAI namespace is used only for illustration. A sample document instance which uses this schema is also included.
 
dcterms.xsd
This schema defines the Qualified Dublin Core element refinements and encodings. It imports the dc1.1.xsd schema and defines the various refinements as members of the substitutionGroups headed by the appropriate imported DC elements. By using substitutionGroups, the schema is self-documenting, and it also allows for enforcement of any constraints that an importing schema applies to the 15 base elements. For example, a schema that imports the dc1.1.xsd and the dcterms.xsd, but then disallows some of the 15 core elements, will automatically disallow the corresponding refinement elements as well (see restricted_dc_container.xsd and test_restricted.xml).
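
A minimal sketch of what such refinement declarations might look like follows. It is an illustration rather than an excerpt from dcterms.xsd; the three refinements shown (alternative, created, abstract) and the elements they refine are standard DCMI terms, while the schemaLocation value is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns="http://purl.org/dc/terms/"
           targetNamespace="http://purl.org/dc/terms/"
           elementFormDefault="qualified">

  <xs:import namespace="http://purl.org/dc/elements/1.1/"
             schemaLocation="dc1.1.xsd"/>

  <!-- Each refinement names the element it refines as the head of its
       substitution group. With no type attribute, the refinement
       takes the type of its head element. -->
  <xs:element name="alternative" substitutionGroup="dc:title"/>
  <xs:element name="created"     substitutionGroup="dc:date"/>
  <xs:element name="abstract"    substitutionGroup="dc:description"/>
</xs:schema>
```

Wherever a content model refers to dc:title, a validator will also accept dcterms:alternative in its place, and a content model that omits dc:title excludes both.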

The various qualified DC encodings are defined as complexTypes derived from the SimpleLiteral type. This means that they must be specified in instance documents by use of the xsi:type attribute. Otherwise, they are identical to the SimpleLiteral type: they allow only text content with an optional xml:lang attribute. However, as will be shown later, it is easy to restrict these encodings using XML data typing, regular expressions, or enumerations.
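
For illustration, an instance document might flag encodings as sketched below. The container element and its namespace are hypothetical, and the sketch assumes the encoding types live in the dcterms namespace under the names used on this page (W3CDTF, RFC1766, URI).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical container element and namespace, for illustration only. -->
<metadata xmlns="http://example.org/container/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:dcterms="http://purl.org/dc/terms/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <!-- A refinement element whose value is flagged as W3CDTF-encoded. -->
  <dcterms:created xsi:type="dcterms:W3CDTF">2002-07-15</dcterms:created>

  <!-- Encodings can also be applied to the base elements. -->
  <dc:language xsi:type="dcterms:RFC1766">en-US</dc:language>
  <dc:identifier xsi:type="dcterms:URI">http://example.org/item/1</dc:identifier>

  <!-- Without xsi:type the value is a plain SimpleLiteral. -->
  <dc:title xml:lang="en">Annual Report</dc:title>
</metadata>
```

With the unrestricted encoding types, xsi:type only labels the value; the text itself is not checked.
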
qualified_dc_container.xsd
This schema imports the dc1.1.xsd and dcterms.xsd and defines a root element which may contain any of the simple dc elements, or their refinements. The OAI namespace is used only for illustration. A sample document instance which uses this schema is also included.
 
dcterms_tight.xsd
This schema is identical to dcterms.xsd except that, for illustration, it restricts some of the encodings to particular data types. Specifically: W3CDTF is restricted to valid XML dateTime values; DCMIType is restricted to the values enumerated in dcmitype.xsd; URI is restricted to the XML anyURI type; the RFC1766 encoding is limited to the XML language type; the ISO639-2 encoding is restricted to the values defined in iso639-2.xsd; and the Period encoding is restricted to strings that conform to the pattern given in dcmi-period.xsd. Given additional time and motivation, it would be possible to restrict the other encodings so that they allow only valid values. This could be done via XML regular expressions, enumerations, or possibly other means. NOTE: The XSV validator currently does not validate all possible XML data types, including the dateTime type.
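
The sketch below shows how two of these restrictions could be expressed. It is not taken from dcterms_tight.xsd. It assumes that SimpleLiteral is a mixed complex type whose element content may be empty, which is what allows a simpleContent restriction of it, and it inlines a few DCMI Type values where the real schema refers to dcmitype.xsd.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns="http://purl.org/dc/terms/"
           targetNamespace="http://purl.org/dc/terms/"
           elementFormDefault="qualified">

  <xs:import namespace="http://purl.org/dc/elements/1.1/"
             schemaLocation="dc1.1.xsd"/>

  <!-- Restriction by built-in data type: the text content must be a
       valid xs:dateTime. -->
  <xs:complexType name="W3CDTF">
    <xs:simpleContent>
      <xs:restriction base="dc:SimpleLiteral">
        <xs:simpleType>
          <xs:restriction base="xs:dateTime"/>
        </xs:simpleType>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>

  <!-- Restriction by enumeration: only listed values are accepted.
       Only a few values of the DCMI Type vocabulary are shown. -->
  <xs:complexType name="DCMIType">
    <xs:simpleContent>
      <xs:restriction base="dc:SimpleLiteral">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:enumeration value="Collection"/>
            <xs:enumeration value="Dataset"/>
            <xs:enumeration value="Image"/>
            <xs:enumeration value="Text"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
```

Under these definitions a value such as 2002-07-15T12:00:00Z passes as W3CDTF, while a bare 2002-07-15 does not, because xs:dateTime requires a time part.
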
tightly_qualified_dc_container.xsd
This schema imports the dc1.1.xsd and dcterms_tight.xsd and defines a root element which may contain any of the simple dc elements, or their refinements. The OAI namespace is used only for illustration. A sample document instance which uses this schema is also included.

Dumb-down

Below are two XSLT stylesheets that can be used to dumb-down qualified DC refinement elements, as defined in our XML schema, into their equivalent simple DC elements. All other elements in the source XML are left unchanged.

The first, DCDD.xsl, isn't that interesting in that it just hardcodes the mappings from DCQ to DC elements in the stylesheet.
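
A minimal sketch of the hardcoded approach, written as XSLT 1.0, is shown below. It is not the text of DCDD.xsl; it covers only a handful of refinements, and the choice of which ones to show is arbitrary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">

  <!-- Identity rule: copy everything that no other template matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hardcoded mappings from refinements to the elements they refine.
       Attributes (including xsi:type) and content are carried over. -->
  <xsl:template match="dcterms:alternative">
    <dc:title><xsl:apply-templates select="@*|node()"/></dc:title>
  </xsl:template>

  <xsl:template match="dcterms:created | dcterms:issued | dcterms:modified">
    <dc:date><xsl:apply-templates select="@*|node()"/></dc:date>
  </xsl:template>

  <xsl:template match="dcterms:abstract | dcterms:tableOfContents">
    <dc:description><xsl:apply-templates select="@*|node()"/></dc:description>
  </xsl:template>

  <!-- Any other dcterms element is dropped. By default priority this
       rule outranks the identity rule but loses to the named rules. -->
  <xsl:template match="dcterms:*"/>
</xsl:stylesheet>
```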

The second, DCDD2.xsl, is more interesting in that it actually uses the substitutionGroup attributes in the dcterms XML Schema to determine how to dumb-down the qualified elements. Theoretically, if additional refinements are added to the dcterms namespace, the dumb-down stylesheet would not require changes. Right now I am using the schema at http://homes.ukoln.ac.uk/~lispj/boston/boston3/dcterms.xsd, but this would be changed to wherever the final home for the stylesheet is located.
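
The schema-driven idea can be sketched in XSLT 1.0 roughly as follows. This is an illustration, not the text of DCDD2.xsl. The parameter name schema-uri and its default value are assumptions, as are two simplifications: every substitutionGroup value is a prefixed name such as dc:title, and every head element belongs to the DC 1.1 namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:dcterms="http://purl.org/dc/terms/">

  <!-- Assumed schema location; a relative URI is resolved against the
       location of this stylesheet. -->
  <xsl:param name="schema-uri" select="'dcterms.xsd'"/>
  <xsl:variable name="schema" select="document($schema-uri)"/>

  <!-- Identity rule: copy everything that no other template matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="dcterms:*">
    <xsl:variable name="name" select="local-name()"/>
    <!-- Look up the element declaration with the same name and read the
         head of its substitution group, for example dc:title. -->
    <xsl:variable name="head"
        select="$schema/xs:schema/xs:element[@name = $name]/@substitutionGroup"/>
    <!-- Elements without a declaration or a head are dropped. -->
    <xsl:if test="$head">
      <xsl:element name="dc:{substring-after($head, ':')}"
                   namespace="http://purl.org/dc/elements/1.1/">
        <xsl:apply-templates select="@*|node()"/>
      </xsl:element>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because the mapping is read from the schema at run time, a refinement added to the schema is picked up without editing the stylesheet. A refinement of a refinement would need a recursive lookup, which this sketch omits.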

Both stylesheets currently throw out any unknown elements from the dcterms namespace. They also both leave the xsi:type attributes in place, even on the dumbed-down elements. Both of these behaviors could easily be changed.


Tom Habing, 2002-07