Combined Comments on Draft v.7 of METS Profile 2006-10-12
====================================================================

1. There needs to be some explanation of the purpose of the objects that use the profile, partly to explain the emphasis on preservation metadata and to explain the purpose behind keeping all versions of the structMap and descriptive metadata. What are the practical implications of keeping track of the transformations done to the metadata? Will institutions really be able to keep track as envisioned?

--- TGH: I think what this question is trying to get at is why anyone would use this profile, and to what purpose. I tried to address this question very briefly in the abstract, but maybe more detail is needed. The answer is essentially that this profile is not concerned with rendering or making accessible any particular representation of an object; it is concerned with preserving the object and its representations, including the history of how those have changed over time. This profile defines a representation as a combination of the descriptive metadata, files, structural maps, structural map links, and behaviors. In general the profile is agnostic about almost all of these parts of the representation. We have made some pragmatic concessions, such as mandating at least MODS for the dmdSec, but otherwise don't have many requirements for these sections. However, this profile is very prescriptive when it comes to administrative metadata, which can be associated with almost all of the sections that make up the representation, particularly technical and provenance metadata, because we feel that those are important to preservation.

Regarding support for multiple alternate versions of the descriptive metadata and structural maps (and the content files also): this profile accommodates this for cases where it is useful, but it does not mandate it.
The tools that we are developing (the Hub and Spoke) will take advantage of this feature, especially for the descriptive metadata, which our tool may transform in significant ways. Just as you would probably not delete a source content file after doing a format migration to a more 'preservable' format (say from TIF to JPEG2000), we also assume that you will not delete descriptive metadata or a structural map just because you have migrated to a new format or version (say MARC to MODS, or even from MODS to MODS with a newly revised abstract). It will be up to a particular system developer and/or collection curator to decide to what extent to implement this. In general, you wouldn't want to preserve an old version of the descriptive metadata every time a new subject term was added or a spelling mistake was corrected, but if the descriptive metadata was substantially rewritten because of new scholarly discoveries about the object, this probably would require preserving the old metadata along with a provenance statement describing the rationale for the changes in the new metadata. One of the assumptions of this profile is that the objects being packaged are in some sort of "preservation state" -- a deliberate decision was made by some agent to put the object into that state, and the object isn't expected to be undergoing frequent changes.

--- Resolved: I've used some of the above text to attempt to clarify the purpose of the profile, both in the abstract and other sections.

--------------------------------------------------------------------

2. What is the notion of a "first class preservation object"? It implies that some objects might be in the category of "second class", which I think isn't really the case. In your description rules, you identify the areas of the METS document that are considered first class preservation objects, and administrative metadata is not included.
You are aware that the OAIS model requires that information in your representation network must be treated as a first class preservation object, yes? So, you are formally defining yourself out of compliance with OAIS here, so long as you're not committed to preserving technical metadata in the amdSec.

--- TGH: I agree the notion of first-class, and by extension lower-class, needs to be excised. I think what I was trying to differentiate with this were the elements that comprise the 'representation' from those elements which are administrative metadata about the representation or parts thereof. I think this distinction is still useful, but I agree that all parts are equally important, so I will try to clean up the document to be rid of the idea of 'first-class.'

--- Resolved: The use of the term 'first-class' has been removed from the profile.

--------------------------------------------------------------------

3. The structMap doesn't give the creator of a METS document conforming to this profile too much to go on. There are no real requirements. Since the structMap is the heart of the METS document, the user of the profile might need more guidance, since basically it says that you can have as many as you want and there are all these different options.

--- TGH: This was deliberate (with the exception of web captures). Since our focus is on preservation and not access or rendering, and we are agnostic about the types of digital objects which can be represented by this profile for preservation, we cannot be any more prescriptive about structural maps than we can be for content files. If a digital object representation requires the Indiana METS Navigator, that would be the structural map it should use; if it requires some other structural map, or multiple structural maps, that is fine too. At some future point a new and better page turning application may emerge that requires a new structural map.
The METS document would then contain both the old structural map and the new, with appropriate technical and provenance metadata, so that a future curator could make informed decisions about the object and its possible representations. Our profile is not so concerned with the actual structural map, but with how that structural map can be described for preservation; thus, unlike many other profiles, we recommend that structural maps have associated administrative metadata, both technical and provenance, to facilitate the long term preservation of the structural map, regardless of what the structure actually is.

--- Resolved: Tried to clarify the language in the profile to address this issue. Also, see #1 above.

---------------------------------------------------------------------

4. There is reference to this as a top level profile, but it doesn't explain how that will work with other profiles or what that means. We did have the notion at one time of a "vanilla" profile, and that we would have subprofiles (that no one has ever done before although it's theoretically possible -- except for the fact that the profile attribute says what profile the METS document conforms to, but you can't repeat that for subprofiles). I don't see any evidence on the Echodep website any more of this notion of subprofiles. Where are we going with it?

--- TGH: The top-level profile is left over from some earlier ideas. I think we have pretty much abandoned the idea of a hierarchy of profiles though, so this should probably be dropped.

--- Resolved: We are not going to abandon this idea, but we need to better emphasize it. We are now calling our 'top-level' profile a 'generic' profile. To emphasize how this is a generic profile, we extracted the requirements about web captures and put them into their own profile, which references the generic profile.
We are hoping that this will better emphasize the preservation and interoperability qualities of the generic profile without confusing the issues by also including language about web capture. The new web capture profile references the generic profile, only describing requirements that go beyond or override the requirements of the generic profile. We anticipate that additional second-level profiles that reference the generic profile will be added over time.

---------------------------------------------------------------------

5. On linking and embedding: Frankly, if you're specifying an upper limit of 40 MB on a METS document, you might as well completely outlaw embedding content files; I've dealt with text files that large. Any preservation quality master image will probably break that limit. I'm assuming you didn't outlaw it outright because you could manage instances where people might want the flexibility. If you're going to allow that, I don't think you even really need the linking vs. embedding section.

--- TGH: I think we want to support both, but encourage embedded metadata and referenced content files. But as Jerry says, if we allow both we must be prepared to deal with both, so I agree that the section describing the 40 MB rule probably isn't needed. Technically we will be able to deal with very large METS files, even if it is not the most efficient.

--- Resolved: The 40 MB rule has been removed. We still recommend embedding metadata and linking to content, but either can be used.

---------------------------------------------------------------------

6. Are there ever circumstances in which you would allow for metadata to be deleted? If so, do you need METADATA_DELETION, STRUCTMAP_DELETION, etc. in your controlled vocabularies?

--- TGH: Good question. This being a preservation profile, deletions should be discouraged, but there are always exceptions :-).
Assuming that descriptive metadata has been deleted, from where would you attach the digiprovMD saying that it was deleted? Or maybe it is not really deleted, but just marked as deleted by the provenance event. Another option would be to have an empty dmdSec or structMap section to reference the digiprovMD from. In general, I want to avoid dangling amdSec sections -- every amdSec should be referenced from some other section. Anyone else have any ideas about this?

--- Resolved: Added METADATA_DELETION and STRUCTMAP_DELETION to the vocabulary and also added text describing how deletions, including deletions of file content, should be handled by the profile. The profile supports both marking items as deleted but still keeping them in the package, and marking them as deleted and really deleting them from the package.

---------------------------------------------------------------------

7. The STRUCTMAP_TRANSFORMATION vs. STRUCTMAP_MODIFICATION distinction is only going to spawn confusion and debate and I don't see how it really helps anyone. I'd get rid of it.

--- TGH: I agree that this is confusing. I was trying to parallel the values that we use for metadata, where I also differentiate between TRANSFORMATION and MODIFICATION; there the primary difference is that a transformation changes the metadata format, but a modification does not. With structural maps almost any change (moving, inserting, or deleting div elements) changes the format of the structMap, so maybe it is a useless distinction. I suppose you could come up with some sort of language that makes it meaningful though. For example, a transformation could be defined as a change that makes the structMap incompatible with systems that were designed to process the source structMap, while a modification to a structMap does not affect the structMap's ability to be processed by systems designed to work with the source structMap.
I think this definition still pretty much parallels the usage of the terms with metadata. Comments from anyone else?

--- Resolved: We will keep this distinction, but better define it: changes which break compatibility with previous structMap processing are to be considered transformations, while changes which maintain backward compatibility with previous processing systems are modifications.

---------------------------------------------------------------------

8. Requiring conformance to an Aquifer profile which doesn't yet exist makes me nervous.

--- TGH: Me too, but we are hopeful that something will be there soon. Maybe until that point we should drop the reference and just mandate MODS in general, and once the Aquifer profile is 'official' we can revise the profile. Anyone else have strong feelings on this?

--- Resolved: We feel that the Aquifer profile is near enough to completion that we are comfortable referencing it in this profile.
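
---------------------------------------------------------------------

Illustrative sketch for #6 and #7: the event-type rules in those two resolutions can be expressed as a small decision function. This is only a sketch under stated assumptions, not part of the profile. The three vocabulary terms come from the comments above; the names StructMapEvent and classify_structmap_change and the two boolean inputs are hypothetical, and deciding whether a change actually breaks existing processing systems is left to the curator or system developer.

```python
from enum import Enum


class StructMapEvent(Enum):
    """Event types named in comments #6 and #7; values are the vocabulary terms."""
    MODIFICATION = "STRUCTMAP_MODIFICATION"
    TRANSFORMATION = "STRUCTMAP_TRANSFORMATION"
    DELETION = "STRUCTMAP_DELETION"


def classify_structmap_change(deleted: bool,
                              breaks_existing_processing: bool) -> StructMapEvent:
    """Hypothetical helper: choose the provenance event type for a structMap change.

    deleted: the structMap was deleted (or marked as deleted).
    breaks_existing_processing: systems designed to process the source
        structMap can no longer process the changed one.
    """
    if deleted:
        return StructMapEvent.DELETION
    if breaks_existing_processing:
        # Compatibility with previous processing is broken: a transformation.
        return StructMapEvent.TRANSFORMATION
    # Backward compatibility is maintained: a modification.
    return StructMapEvent.MODIFICATION


if __name__ == "__main__":
    print(classify_structmap_change(False, False).value)
    print(classify_structmap_change(False, True).value)
    print(classify_structmap_change(True, False).value)
```

Under this reading, a change that the existing processing systems still handle would be recorded as STRUCTMAP_MODIFICATION, and one that they can no longer handle would be recorded as STRUCTMAP_TRANSFORMATION.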